Gene expression in biological conditions

Technical field of the invention 

The present invention relates to a method of predicting the prognosis of a biological
condition in animal tissue, wherein the expression of genes is examined and correlated to
standards. The invention further relates to the treatment of the biological condition and an
assay for predicting the prognosis.

Background

The building of large databases containing human genome sequences is the basis for
studies of gene expression in various tissues during normal physiological and pathological
conditions. Constantly (constitutively) expressed sequences as well as sequences whose
expression is altered during disease processes are important for our understanding of
cellular properties, and for the identification of candidate genes for future therapeutic
intervention. As the number of known genes and ESTs builds up in the databases,
array-based simultaneous screening of thousands of genes is necessary to obtain a profile of
transcriptional behaviour, and to identify key genes that, either alone or in combination with
other genes, control various aspects of cellular life. One cellular behaviour that has been a
mystery for many years is the malignant behaviour of cancer cells. It is now known that, for
example, defects in DNA repair can lead to cancer, but the cancer-creating mechanism in
heterozygous individuals is still largely unknown, as is the malignant cell's ability to repeat
cell cycles, to avoid apoptosis, to escape the immune system, to invade and metastasize, and
to escape therapy. There are indications in these areas and excellent progress has been
made, but the myriad of genes interacting with each other in a highly complex
multidimensional network is making the road to insight long and contorted.

Similar appearing tumors (morphologically, histochemically, microscopically) can be
profoundly different. They can have different invasive and metastasizing properties, as well
as respond differently to therapy. There is thus a need in the art for methods which
distinguish tumors and tissues on factors different from those currently in clinical use.

The malignant transformation from normal tissue to cancer is believed to be a multistep
process, in which tumor suppressor genes that normally repress cancer growth show reduced
gene expression, and in which other genes that encode tumor-promoting proteins (oncogenes)
show an increased expression level. Several tumor suppressor genes have been identified so
far, e.g. p16, Rb and p53 (Nesrin Özören and Wafik S. El-Deiry, Introduction to cancer genes
and growth control, In: DNA alterations in cancer, genetic and epigenetic changes, Eaton
Publishing, Melanie Ehrlich (ed), p. 1-43, 2000; and references therein). They are usually
identified by their lack of expression or their mutation in cancer tissue.



Other examinations have shown this downregulation of transcripts to be partly due to loss of
genomic material (loss of heterozygosity), partly to methylation of promoter regions, and
partly due to unknown factors (Nesrin Özören and Wafik S. El-Deiry, Introduction to cancer
genes and growth control, In: DNA alterations in cancer, genetic and epigenetic changes,
Eaton Publishing, Melanie Ehrlich (ed), p. 1-43, 2000; and references therein).

Several oncogenes are known, e.g. cyclinD1/PRAD1/BCL1, FGFs, c-MYC and BCL-2, all of
which are genes that are amplified in cancer, showing an increased level of transcript
(Nesrin Özören and Wafik S. El-Deiry, Introduction to cancer genes and growth control, In:
DNA alterations in cancer, genetic and epigenetic changes, Eaton Publishing, Melanie
Ehrlich (ed), p. 1-43, 2000; and references therein). Many of these genes are related to cell
growth and direct the tumor cells to uninhibited growth. Others may be related to tissue
degradation, as they e.g. encode enzymes that break down the surrounding connective tissue.

Bladder cancer is the fourth most common malignancy in males in the western countries
(Pisani). The disease basically takes two different courses: one where patients have multiple
recurrences of superficial tumors (Ta and T1), and one where the disease from the beginning
is muscle invasive (T2+) and leads to metastasis. About 5-10% of patients with Ta tumors
and 20-30% of the patients with T1 tumors will eventually develop a higher stage tumor
(Wolf). Patients with superficial bladder tumors represent 75% of all bladder cancer patients,
and no clinically useful markers identifying patients with a poor prognosis exist at present.

Patients presenting isolated or concomitant carcinoma in situ (CIS) lesions have a high
risk of disease progression to a muscle invasive stage (Althausen). The CIS lesions may
have a widespread manifestation in the bladder (field disease) and are believed to be the
most common precursors of invasive carcinomas (Spruck, Rosin). The ability to predict
which tumours are likely to recur or progress would have great impact on the clinical
management of patients with superficial disease, as it would be possible to treat high-risk
patients more aggressively (e.g. radical cystectomy or adjuvant therapy). This approach is
currently not possible, as no clinically useful markers exist that identify these patients.
Although many prognostic markers have been investigated, the most important prognostic
factors are still disease stage, dysplasia grade and especially the presence of areas with CIS
(Anderstrom, Cummings, Cheng). The gold standard for detection of CIS is urine cytology
and histopathologic analysis of a set of selected site biopsies removed during routine
cystoscopy examinations; however, these procedures are not sufficiently sensitive.
Combining routine cystoscopy examinations with 6-ALA fluorescence imaging of the
tumours and pre-cancerous lesions (CIS lesions and moderate dysplasia lesions) may
increase the sensitivity of the procedure (Kriegmar); however, increased detection sensitivity
is still necessary in order to offer better treatment regimens to the individual patients.



Summary of the invention 



The present invention relates to prediction of the prognosis of a biological condition, in
particular the prognosis of cancer such as bladder cancer. It is known that individuals
suffering from cancer, although their tumors are macroscopically and microscopically
identical, may have very different outcomes. The present inventors have identified new
predictor genes to classify macroscopically and microscopically identical tumors into two or
more groups, wherein each group has a separate risk profile of recurrence, invasive growth,
metastasis etc. as compared to the other group(s). The present invention relates to
genotyping of the tissue, and correlating the result to standard expression level(s) to predict
the prognosis of the biological condition.

Accordingly, in one aspect the present invention relates to a method of predicting the
prognosis of a biological condition in animal tissue,

comprising collecting a sample comprising cells from the tissue and/or expression
products from the cells,

determining an expression level of at least one gene in said sample, said gene being
selected from the group of genes consisting of gene No. 1 to gene No. 562,

correlating the expression level to at least one standard expression level to predict the
prognosis of the biological condition in the animal tissue.
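
Purely as an illustration, the correlating step can be sketched as a fold-change comparison of a measured expression level against a standard level. The use of a log2 ratio, the two-fold threshold and the function names are assumptions made for this sketch; the specification does not prescribe them.

```python
import math


def log2_ratio(sample_level, standard_level):
    """Return the log2 ratio of a measured level to a standard level."""
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    return math.log2(sample_level / standard_level)


def compare_to_standard(sample_level, standard_level, threshold=1.0):
    """Label a gene as 'higher', 'lower' or 'similar' relative to the standard.

    threshold is a hypothetical cutoff in log2 units (1.0 means two-fold).
    """
    ratio = log2_ratio(sample_level, standard_level)
    if ratio >= threshold:
        return "higher"
    if ratio <= -threshold:
        return "lower"
    return "similar"


# Illustrative values only, not data from the specification.
print(compare_to_standard(400.0, 100.0))
```

In practice the standard level could be taken from reference tissue or from a group of samples with known outcome; which standard is appropriate depends on the embodiment.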

The genes No. 1 to No. 562 are found in Table A described below.

Animal tissue may be tissue from any animal, preferably from a mammal, such as a horse, a
cow, a dog or a cat, and more preferably the tissue is human tissue. The biological condition
may be any condition exhibiting gene expression different from normal tissue. In particular
the biological condition relates to a malignant or premalignant condition, such as a tumor or
cancer, in particular bladder cancer. By the term "collecting a sample comprising cells" is
meant that the sample is provided in a manner such that the expression level of the genes
may be determined.

Furthermore, the invention relates to a method of determining the stage of a biological
condition in animal tissue,

comprising collecting a sample comprising cells from the tissue,

determining an expression level of at least one gene in said sample, said gene being
selected from the group of genes consisting of gene No. 1 to gene No. 562,

correlating the expression level of the assessed genes to at least one standard level of
expression to determine the stage of the condition.

The determination of the stage of the biological condition may be conducted prior to the
method of predicting the prognosis, or the stage of the biological condition may as such
contain the information about the prognosis.

The methods above may be used for determining single gene expressions; however, the
invention also relates to a method of determining an expression pattern of a bladder cell
sample, comprising:

collecting a sample comprising bladder cells and/or expression products from bladder
cells,

determining the expression level of at least one gene in the sample, said gene being
selected from the group of genes consisting of gene No. 1 to gene No. 562, and obtaining
an expression pattern of the bladder cell sample.

Further, the invention relates to a method of determining an expression pattern of a bladder 
cell sample independent of the proportion of submucosal, muscle, or connective tissue cells 
present, comprising: 

determining the expression of one or more genes in a sample comprising cells, wherein 
the one or more genes exclude genes which are expressed in the submucosal, muscle, 
or connective tissue, whereby a pattern of expression is formed for the sample which is 
independent of the proportion of submucosal, muscle, or connective tissue cells in the 
sample. 
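
A minimal sketch of the exclusion step, assuming the expression data are held as a mapping from gene identifier to expression level. The gene identifiers and the contents of the exclusion set are hypothetical placeholders, not genes taken from Table A.

```python
def tissue_independent_pattern(expression, excluded_genes):
    """Keep only genes that are not in the exclusion set.

    expression: dict mapping gene identifier -> expression level.
    excluded_genes: set of genes known to be expressed in submucosal,
    muscle or connective tissue (hypothetical list supplied by the user).
    """
    return {
        gene: level
        for gene, level in expression.items()
        if gene not in excluded_genes
    }


# Placeholder identifiers for illustration only.
sample = {"gene_A": 120.0, "gene_B": 45.0, "gene_C": 300.0}
stromal_or_muscle = {"gene_B"}
print(tissue_independent_pattern(sample, stromal_or_muscle))
```

Because the excluded genes never enter the pattern, changing the proportion of non-urothelial cells in a biopsy alters only values that are discarded, which is the property the paragraph above relies on.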

The expression pattern may be used in a method according to the invention, and accordingly
the invention also relates to a method of predicting the prognosis of a biological condition
in human bladder tissue, comprising

collecting a sample comprising cells from the tissue,

determining an expression pattern of the cells as defined in any of claims 43-54,

correlating the determined expression pattern to a standard pattern,

predicting the prognosis of the biological condition of said tissue,

as well as a method for determining the stage of a biological condition in animal tissue,
comprising

collecting a sample comprising cells from the tissue,

determining an expression pattern of the cells as defined above,

correlating the determined expression pattern to a standard pattern,

determining the stage of the biological condition in said tissue.

The invention further relates to a method for reducing cell tumorigenicity or malignancy of a
cell, said method comprising

contacting a tumor cell with at least one peptide expressed by at least one gene selected
from the group of genes consisting of gene Nos. 200-214, 233, 234, 235, 236, 244, 249,
251, 252, 255, 256, 259, 261, 262, 266, 268, 269, 273, 274, 275, 276, 277, 279, 280, 281,
282, 285, 286, 289, 293, 295, 296, 299, 301, 304, 306, 307, 308, 311, 312, 313, 314, 320,
322, 323, 325, 326, 327, 328, 330, 331, 332, 333, 334, 338, 341, 342, 343, 345, 348, 349,
350, 351, 352, 353, 355, 357, 360, 361, 363, 366, 367, 370, 373, 374, 375, 376, 385, 386,
387, 389, 390, 392, 394, 398, 400, 401, 405, 406, 407, 408, 410, 411, 412, 414, 415, 416,
418, 424, 426, 428, 433, 434, 435, 436, 438, 439, 440, 441, 442, 443, 445, 446, 453, 460,
461, 463, 464, 465, 466, 467, 469, 470, 471, 472, 473, 475, 476, 477, 479, 480, 481, 482,
483, 485, 486, 487, 488, 490, 492, 494, 496, 497, 498, 499, 503, 515, 516, 517, 521, 526,
527, 528, 530, 532, 533, 537, 539, 540, 541, 542, 543, 545, 554, 557, 560, or


obtaining at least one gene selected from the group of genes consisting of gene Nos.
200-214, 233, 234, 235, 236, 244, 249, 251, 252, 255, 256, 259, 261, 262, 266, 268, 269,
273, 274, 275, 276, 277, 279, 280, 281, 282, 285, 286, 289, 293, 295, 296, 299, 301, 304,
306, 307, 308, 311, 312, 313, 314, 320, 322, 323, 325, 326, 327, 328, 330, 331, 332, 333,
334, 338, 341, 342, 343, 345, 348, 349, 350, 351, 352, 353, 355, 357, 360, 361, 363, 366,
367, 370, 373, 374, 375, 376, 385, 386, 387, 389, 390, 392, 394, 398, 400, 401, 405, 406,
407, 408, 410, 411, 412, 414, 415, 416, 418, 424, 426, 428, 433, 434, 435, 436, 438, 439,
440, 441, 442, 443, 445, 446, 453, 460, 461, 463, 464, 465, 466, 467, 469, 470, 471, 472,
473, 475, 476, 477, 479, 480, 481, 482, 483, 485, 486, 487, 488, 490, 492, 494, 496, 497,
498, 499, 503, 515, 516, 517, 521, 526, 527, 528, 530, 532, 533, 537, 539, 540, 541, 542,
543, 545, 554, 557, 560, and introducing said at least one gene into the tumor cell in a
manner allowing expression of said gene(s), or

obtaining at least one nucleotide probe capable of hybridising with at least one gene of a
tumor cell, said at least one gene being selected from the group of genes consisting of gene
Nos. 1-199, 215-232, 237, 238, 239, 240, 241, 242, 243, 245, 246, 247, 248, 250, 253, 254,
257, 258, 260, 263, 264, 265, 267, 270, 271, 272, 278, 283, 284, 287, 288, 290, 291, 292,
294, 297, 298, 300, 302, 303, 305, 309, 310, 315, 316, 317, 318, 319, 321, 324, 329, 335,
336, 337, 339, 340, 344, 346, 347, 354, 356, 358, 359, 362, 364, 365, 368, 369, 371, 372,
377, 378, 379, 380, 381, 382, 383, 384, 388, 391, 393, 395, 396, 397, 399, 402, 403, 404,
409, 413, 417, 419, 420, 421, 422, 423, 425, 427, 429, 430, 431, 432, 437, 444, 447, 448,
449, 450, 451, 452, 454, 455, 456, 457, 458, 459, 462, 468, 474, 478, 484, 489, 491, 493,
495, 500, 501, 502, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 518, 519, 520,
522, 523, 524, 525, 529, 531, 534, 535, 536, 538, 544, 546, 547, 548, 549, 550, 551, 552,
553, 555, 556, 558, 559, 561, 562, and introducing said at least one nucleotide probe into
the tumor cell in a manner allowing the probe to hybridise to the at least one gene, thereby
inhibiting expression of said at least one gene.

In a further aspect the invention relates to a method for producing antibodies against an
expression product of a cell from a biological tissue, said method comprising the steps of

obtaining expression product(s) from at least one gene, said gene being expressed as
defined above,

immunising a mammal with said expression product(s), and obtaining antibodies against the
expression product.

The antibodies produced may be used for producing a pharmaceutical composition. Further,
the invention relates to a vaccine capable of eliciting an immune response against at least
one expression product from at least one gene, said gene being expressed as defined above.

The invention furthermore relates to the use of any of the methods discussed above for
producing an assay for diagnosing a biological condition in animal tissue.

Also, the invention relates to the use of a peptide as defined above as an expression product
and/or the use of a gene as defined above and/or the use of a probe as defined above for
preparation of a pharmaceutical composition for the treatment of a biological condition in
animal tissue.

In yet a further aspect the invention relates to an assay for determining the presence or
absence of a biological condition in animal tissue, comprising

at least one first marker capable of detecting an expression level of at least one gene
selected from the group of genes consisting of gene No. 1 to gene No. 562.

In another aspect the invention relates to an assay for determining an expression pattern of
a bladder cell, comprising at least a first marker and/or a second marker, wherein the
first marker is capable of detecting a gene from a first gene group as defined above, and the
second marker is capable of detecting a gene from a second gene group as defined above.

Drawings 

Description of figures:

Figure 1 Hierarchical cluster analysis of tumor samples based on 3,197 genes that show 
large variation across all tumor samples. Samples with progression are marked Prog. 

Figure 2 Delineation of the 200 best marker genes. Genes that show higher levels of
expression in the non-progression group are shown at the top and genes that show higher
levels of expression in the progression group are shown at the bottom. Each column in the
diagram represents a tumor sample and each row a gene. The 13 non-progressing samples
are shown to the left and the 16 progressing samples are shown to the right in the diagram.
The color saturation indicates differences in gene expression across the tumor samples; light
color indicates up-regulation compared to the median expression of the gene, and dark color
indicates down-regulation compared to the median expression. Gene names of particularly
interesting genes are listed. Notably, non-group expression patterns were observed for two
tumors (arrows). The tumor in the non-progression group (150-6) showed a solid growth
pattern, which is associated with a poor prognosis. No special tumor characteristics can
help explain the gene expression pattern observed for the tumor in the progression group
(825-3).

Figure 3. Cross-validation performance using from 1 to 200 genes. 



Figure 4. Predicting progression in early stage bladder tumors. a, The 45-gene expression
signature found to be optimal for progression prediction. Genes showing high expression in
progressing samples are shown at the top and genes showing high expression in the
non-progressing samples are shown at the bottom. Genes are listed according to how many
cross-validation loops included the genes. b, The 45-gene expression signature in the 19
tumor test-set. The samples are listed according to the correlation to the average
non-progression signature from the training set samples. The red dashed line separates
samples with positive (left) and negative (right) correlation values. The white lines separate
samples above and below the correlation cutoff values of 0.1 and -0.1. The sample legend
indicates no-progression (N) samples and progression (P) samples.
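
The ranking described for Figure 4b can be sketched as follows, assuming a Pearson correlation between each test profile and the mean non-progression profile of the training set. The choice of Pearson correlation, the handling of samples between the two cutoffs as unclassified, and all function names are assumptions of this sketch; only the cutoff values 0.1 and -0.1 come from the figure description.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def average_signature(training_profiles):
    """Gene-wise mean over training profiles (one list per sample)."""
    n = len(training_profiles)
    return [sum(column) / n for column in zip(*training_profiles)]


def call_sample(profile, reference, cutoff=0.1):
    """Return (label, r): 'N' above +cutoff, 'P' below -cutoff."""
    r = pearson(profile, reference)
    if r > cutoff:
        return "N", r
    if r < -cutoff:
        return "P", r
    return "unclassified", r


# Toy numbers for illustration only, not data from the specification.
reference = average_signature([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]])
print(call_sample([2.0, 4.0, 6.0], reference))
```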

Figure 5 Hierarchical cluster analysis of the metachronous tumor samples. Tightly clustering
tumors of different stage from the same patients are colored in grey.

Figure 6 Two-way hierarchical clustering and multidimensional scaling analysis of gene
expression data from 40 bladder tumour biopsies. a, Tumour cluster dendrogram based on
the 1767 gene-set. CIS annotations following the sample names indicate concomitant
carcinoma in situ. Tumour recurrence rates are shown to the right of the dendrogram as +
and ++ indicating moderate and high recurrence rates, respectively, while no sign indicates
no or moderate recurrence. b, Tumour cluster dendrogram based on 88 cancer related
genes. c, 2D plot of multidimensional scaling analysis of the 40 tumours based on the 1767
gene-set. The colour code identifies the tumour samples from the cluster dendrogram (Fig.
1a). d, Two-way cluster analysis diagram of the 1767 gene-set. Each row in the diagram
represents a gene and each column a tumour sample. The colour saturation represents
differences in gene expression across the tumour samples; light colour indicates higher
expression of the gene compared to the median expression, and lower expression of the
gene compared to the median expression is shown in dark colour. The colour intensities
indicate degrees of gene regulation. The sidebars to the right of the diagram represent gene
clusters a-j, and normal 1-3 in the left side indicate the three normal biopsies and normal 4
indicates the pool of biopsies from 37 patients.

Figure 7 Enlarged view of the gene clusters a, c, f, and g. The dendrogram at the top is
identical to Fig. 6a. a, Cluster of transcription factors and other nuclear associated genes. c,
Cluster of genes involved in proliferation and cell cycle control. f, Gene expression pattern
and corresponding area with squamous metaplasia in urothelial carcinoma. The light colour
indicates genes up-regulated in samples 1178-1 and 875-1, the only two samples with
squamous cell metaplasia. g, Cluster of genes involved in angiogenesis and matrix
remodelling.

Figure 8. Hierarchical cluster analysis results

Here we show expanded views of clusters a-j as identified in the 1767 gene-cluster. The
tumour cluster dendrogram and colour bars on top of the clusters represent the same
tumour cluster as shown in the paper. The four samples to the left are normal biopsies
(normal 1-3) and a pool of 37 normal biopsies (normal 4).

Figure 8a. Molecular classification of tumour samples using 80 predictive genes in each
cross-validation loop. Each classification is based on the closeness to the mean in the three
classes. Samples marked with * were not used to build the classifier. The scale indicates the
distance from the samples to the classes in the classifier, measured in weighted squared
Euclidean distance.
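
The classification rule described for Figure 8a, assignment to the class whose mean is closest in weighted squared Euclidean distance, can be sketched as below. The per-gene weights are supplied by the caller because the figure description does not state how they are derived; the class labels, toy values and function names are likewise assumptions of this sketch.

```python
def class_means(samples_by_class):
    """Gene-wise mean profile for each class.

    samples_by_class: dict mapping class label -> list of profiles,
    each profile being a list of expression values in the same gene order.
    """
    return {
        label: [sum(column) / len(profiles) for column in zip(*profiles)]
        for label, profiles in samples_by_class.items()
    }


def weighted_sq_distance(profile, mean, weights):
    """Sum over genes of weight * (value - class mean) squared."""
    return sum(w * (a - m) ** 2 for a, m, w in zip(profile, mean, weights))


def classify(profile, means, weights):
    """Return (closest class label, dict of distances to every class)."""
    distances = {
        label: weighted_sq_distance(profile, mean, weights)
        for label, mean in means.items()
    }
    return min(distances, key=distances.get), distances


# Toy two-gene example for illustration only.
means = class_means({
    "Ta": [[1.0, 1.0], [3.0, 1.0]],
    "T1": [[5.0, 5.0]],
    "T2": [[9.0, 1.0]],
})
print(classify([2.5, 1.5], means, [1.0, 1.0]))
```

In a cross-validation setting, the class means and weights would be recomputed in each loop from the training samples only, so that a held-out sample never contributes to the classifier that judges it.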

Figure 9 Number of classification errors vs. number of genes used in cross-validation loops. 


Figure 10 Expression profiles of the 71 genes used in the final classifier model. The tumors
shown are the 33 tumors used in the cross validation scheme. The Ta tumors are shown to
the left, the T1 tumors in the middle, and the T2 tumors to the right.

Figure 11 Number of prediction errors vs. number of genes used in cross-validation loops.

Figure 12 The expression profiles of the 26 genes that constitute our final prediction model.
The genes are listed according to the degree of correlation with the recurrence and
non-recurrence groups. Genes with highest correlations are found at the top and the bottom
of the list.

Figure 13. Hierarchical cluster analysis of the gene expression in 41 TCC, 9 normal
samples and 10 samples from cystectomy specimens with CIS lesions. a, Cluster
dendrogram of all 41 TCC biopsies based on the expression of 5,491 genes. b, Cluster
dendrogram of all superficial TCC biopsies based on the expression of 5,252 genes. c,
Two-way cluster analysis diagram of the 41 TCC biopsies together with gene expressions in
the normal and cystectomy samples (left columns). Each row represents a gene and each
column represents a biopsy sample. Yellow indicates up-regulation compared to the median
expression (black) of the gene and blue indicates down-regulation compared to the median
expression. The colour saturation indicates degree of gene regulation. The sidebars to the
right of the diagram represent gene-clusters 1-4; enlarged views of cluster 1 and 4 are
shown to the right, with all gene symbols listed.

Figure 14. Delineation of the 100 best markers that separate TCC without CIS from TCC
with concomitant CIS. a, The 50 best up-regulated marker genes in TCC without CIS are
shown at the top and the 50 best up-regulated marker genes in TCC with CIS are shown at
the bottom. The gene symbols are listed to the right of the diagram. b, Expression profiles of
the 100 marker genes in 9 normal biopsies (left column), 5 histologically normal samples
adjacent to CIS lesions (middle column), and 5 biopsies with CIS lesions detected (right
column).

Figure 15 Cross validation performance using all samples 


Figure 16 Expression profiles of the 16 genes in the CIS classifier. a, The expression of the
16 classifier genes in TCC with no surrounding CIS (left) and in TCC with surrounding CIS
(right). The gene symbols of the classifier genes are listed together with the number of
times used in cross-validation loops. b, The expression of the 16 classifier genes in normal
samples, in histologically normal samples adjacent to CIS lesions, and in biopsies with CIS
lesions. The top dendrogram shows the sample clustering from hierarchical cluster analysis
based on the 16 classifier genes. The genes appear in the same order as in 3a.

Figure 17 Cross validation performance using half of the samples 


Figure 18 shows table B
Figure 19 shows table C
Figure 20 shows table D
Figure 21 shows table E
Figure 22 shows table F
Figure 23 shows table G
Figure 24 shows table H

Detailed description of the invention

As discussed above the present invention relates to the finding that it is possible to predict
the prognosis of a biological condition by determining the expression level of one or more
genes from a specified group of genes and comparing the expression level to at least one
standard for expression levels. The present inventors have identified 562 genes relevant for
predicting the prognosis of a biological condition, in particular a cancer disease, such as
bladder cancer.



The following table A shows the genes relevant in this context. Whenever a gene is cited 
herein with reference to a gene No. the numbering refers to the genes of Table A. 



Table A

HUGeneFL
AB000220_at
Hs.75618
3
HUGeneFL
Hs.99855
HUGeneFL
D11086_at
HUGeneFL
Hs.136348
Hs.84728
Hs.169998
description

sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3C
RAB11A, member RAS oncogene family
formyl peptide receptor-like 1
chemokine (C-C motif) receptor 1
interleukin 2 receptor, gamma (severe combined immunodeficiency)
endothelin receptor type A
phosphatidyl inositol glycan, class F
osteoblast specific factor 2 (fasciclin I-like)
Kruppel-like factor 5 (intestinal)
bone marrow stromal cell antigen 1
solute carrier family 1 (glial high affinity glutamate transporter), member 3
DNA2 DNA replication helicase 2-like (yeast)
adipose specific 2
chemokine (C-C motif) ligand 11
transcription elongation factor A (SII), 2
tweety homolog 2 (Drosophila)
protein tyrosine phosphatase, receptor type, R
ficolin (collagen/fibrinogen domain containing) 1
MYC-associated zinc finger protein (purine-binding transcription factor)
chromosome 21 open reading frame 33
AE binding protein 1
likely ortholog of mouse septin 8
Ste20-related serine/threonine kinase
minor histocompatibility antigen HA-1
stabilin 1
sorting nexin 19
KIAA0241 protein
Src-like-adaptor
msh homeo box homolog 2 (Drosophila)
collagen, type V, alpha 1
cytochrome P450, family 4, subfamily B, polypeptide 1
secreted protein, acidic, cysteine-rich (osteonectin)
transforming growth factor, beta 3
platelet-derived growth factor receptor, beta polypeptide



integrin, alpha M (complement component receptor 3, alpha; also known as CD11b (p170), macrophage antigen alpha polypeptide)
carbonyl reductase 1
electron-transfer-flavoprotein, alpha polypeptide (glutaric aciduria II)
chemokine (C-C motif) ligand 4
Fc fragment of IgG, low affinity IIIa, receptor for (CD16)
lectin, galactoside-binding, soluble, 1 (galectin 1)
aspartyl-tRNA synthetase
matrix metalloproteinase 9 (gelatinase B, 92kDa gelatinase, 92kDa type IV collagenase)
polymerase (RNA) II (DNA directed) polypeptide C, 33kDa
serine (or cysteine) proteinase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 1
chemokine (C-X-C motif) receptor 4
protease inhibitor 3, skin-derived (SKALP)
regulator of G-protein signalling 2, 24kDa
growth arrest-specific 1
growth arrest-specific 6
fibrillin 1 (Marfan syndrome)
von Hippel-Lindau syndrome
RNA binding protein with multiple splicing
aryl hydrocarbon receptor
tight junction protein 2 (zona occludens 2)
procollagen C-endopeptidase enhancer
thyroid receptor interacting protein 15
peroxisome proliferative activated receptor, gamma
retinol binding protein 1, cellular
collagen, type V, alpha 2
tropomyosin 2 (beta)
argininosuccinate lyase
integrin, beta 2 (antigen CD18 (p95), lymphocyte function-associated antigen 1; macrophage antigen 1 (mac-1) beta subunit)
hemopoietic cell kinase
guanine nucleotide binding protein (G protein), alpha inhibiting activity polypeptide 1
protein phosphatase 3 (formerly 2B), catalytic subunit, beta isoform (calcineurin A beta)
tumor necrosis factor, alpha-induced protein 6
neutrophil cytosolic factor 2 (65kDa, chronic granulomatous disease, autosomal 2)
Fc fragment of IgE, high affinity I, receptor for; gamma polypeptide
CD53 antigen
CD48 antigen (B-cell membrane protein)
collagen, type I, alpha 1
chemokine (C-X-C motif) ligand 2
acyloxyacyl hydrolase (neutrophil)
monoamine oxidase A
chemokine (C-C motif) ligand 4



transforming growth factor, beta-induced, 

68kDa 

E74-like factor 1 (ets domain transcription 

factor) 

LPS-responsive vesicle trafficking, beach and 
anchor containing 
connective tissue growth factor 
actinln, alpha 1 
natural killer ceil group 7 sequence 
KruppeMike factor 3 (basic) 
cell division cycle 25B 
nucleotide binding protein 1 (MinD homolog, 

E. coli) 

G-rich RNA sequence binding factor 1 
fibroblast activation protein, alpha 

GTP binding protein overexpressed in skeletal 

muscle 

glycerol-3-phosphate dehydrogenase 2 (mito- 
chondrial) 

chondroitln sulfate proteoglycan 2 (versican) 
lymphocyte cytosolic protein 2 (SH2 domain 
containing leukocyte protein of 76kDa) 
caspase 6, apoptosis-related cysteine prote- 
ase 

aldehyde dehydrogenase 4 family, member 

A1 

FXYD domain containing ion transport regula- 
tor 3 

complement component 3a receptor 1 
BCL2-related protein A1 
cytochrome P450, family 2, subfamily J, poly- 
peptide 2 
zinc finger protein 212 
forkhead box A1 



histamine N-methyltransferase 
cyclin G2 

2,4-dienoyl CoA reductase 1, mitochondrial 
branched chain keto acid dehydrogenase E1, 
beta polypeptide (maple syrup urine disease) 
epithelial membrane protein 3 
MAD, mothers against decapentaplegic ho- 
molog 6 (Drosophila) 
sterol-C4-methyI oxldase-iike 
mutS homolog 3 (E. coil) 
vesicle-associated membrane protein 3 (eel- 

lubrevin) 

Cbp/p300-interactJng transact! vator, with 
Glu/Asp-rich carboxy-terminal domain, 2 
SWl/SNF related, matrix associated, actin 
dependent regulator of chromatin, subfamily 

d, member 3 

MAD, mothers against decapentaplegic ho- 
molog 3 (Drosophila) 
likely ortholog of mouse myeloid ecotropic 
viral integration site-related gene 2 
bridging integrator 1 
RAB interacting factor 
neuronal PAS domain protein 2 
chemokine (C-X-C motif) ligand 6 (granulocyte 
chemotactic protein 2) 
peroxisomal biogenesis factor 7 
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high mobility group nucleosomal binding 

domain 4 

coxsackie virus and adenovirus receptor
metallothionein 2A
metallothionein 2A
fibronectin 1 

cytochrome b-245, beta polypeptide (chronic 
granulomatous disease) 



pleckstrin 
CD14 antigen 
CD37 antigen 

acetyl-Coenzyme A acyltransferase 1 (perox- 
isomal 3-oxoacyl-Coenzyme A thiolase) 
collagen, type VI, alpha 1 
collagen, type VI, alpha 2 
chimerin (chimaerin) 1 
chemokine (C-X-C motif) ligand 3 



interferon induced transmembrane protein 2 

(1-8D) 

GATA binding protein 3 
WEE1 homolog (S. pombe)
integrin, beta 2 (antigen CD18 (p95), lymphocyte function-associated antigen 1; macrophage antigen 1 (mac-1) beta subunit)
S100 calcium binding protein P
fibroblast growth factor receptor 1 (fms-related tyrosine kinase 2, Pfeiffer syndrome)
glutamate dehydrogenase 1 
synaptophysin-like protein 
microtubule-associated protein 7 
chloride channel 3 
PTK6 protein tyrosine kinase 6 
tenascin C (hexabrachion) 
reticulocalbin 2, EF-hand calcium binding domain

3-hydroxy-3-methylglutaryl-Coenzyme A synthase 2 (mitochondrial)
phosphorylase kinase, beta
fatty acid binding protein 6, ileal (gastrotropin)
ADP-ribosylation factor related protein 1 
abl-interactor 2 



serine protease inhibitor, Kazal type 1 
interleukin 8 

protein tyrosine phosphatase, receptor type, F 

glucosamine (N-acetyl)-6-sulfatase (Sanfilippo disease IIID)
vimentin

catechol-O-methyltransferase
ubiquitin-conjugating enzyme E2H (UBC8 homolog, yeast)
BCL2-associated athanogene
syndecan 1
inorganic pyrophosphatase 2
collagen, type I, alpha 1
chromosome 1 open reading frame 16

FBJ murine osteosarcoma viral oncogene homolog B
death-associated protein 6 

KIAA0196 gene product 

adhesion regulating molecule 1 
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collagen, type IV, alpha 6 

homeo box C6 

protease, serine, 11 (IGF binding)

milk fat globule-EGF factor 8 protein 

skeletal muscle and kidney enriched inositol phosphatase
cysteine-rich, angiogenic inducer, 61

eukaryotic translation initiation factor 3, subunit 5 epsilon, 47kDa
laminin, alpha 3

acidic (leucine-rich) nuclear phosphoprotein 32 family, member B
protein tyrosine phosphatase, receptor type, N polypeptide 2



DNA segment on chromosome 4 (unique) 234 
expressed sequence 
autocrine motility factor receptor 

leukotriene B4 12-hydroxydehydrogenase



RNA binding motif protein, X chromosome 

general transcription factor IIE, polypeptide 2, beta 34kDa

5,10-methenyltetrahydrofolate synthetase (5-formyltetrahydrofolate cyclo-ligase)
TAF9 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 32kDa
protein tyrosine phosphatase, non-receptor type 3

S100 calcium binding protein A12 (calgranulin C)



keratin 6A 
keratin 6A 
keratin 6E 

small proline-rich protein 1B (cornifin)
Human small proline rich protein (sprII) mRNA, clone 930.

Human small proline rich protein (sprII) mRNA, clone 174N.
small proline-rich protein 2C
S100 calcium binding protein A7 (psoriasin 1)
keratin 16 (focal non-epidermolytic palmoplantar keratoderma)
interleukin 13 receptor, alpha 2
keratin 6A

matrix metalloproteinase 11 (stromelysin 3)
NM_003105*:Homo sapiens sortilin-related receptor, L(DLR class) A repeats-containing (SORL1), mRNA.
NM_003105*:Homo sapiens sortilin-related receptor, L(DLR class) A repeats-containing (SORL1), mRNA.
NM_003105*:Homo sapiens sortilin-related
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sec 
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253 EOS Hu03 
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255 EOS Hu03 

256 EOS Hu03 
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402328 
402384 
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404606 
404826 
404875 
404913 
404977 
405036 
405371 
405667 
406002 
407955 
408049 
408288 
409513 
409556 
409586 
409632 
410047 
411817 
412649 
412841 
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133 
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133 
133 

133 Hs.9343 

133 Hs.345588 

133 Hs.16886

133 Hs.54642 

133 Hs.54941 

133 Hs.55044 

133 Hs.55279 

133 Hs.379753 

133 Hs.72241 

133 Hs.74369 

133 Hs.101395 



receptor, L(DLR class) A repeats-containing 
(SORL1), mRNA. 
sortilin-related receptor, L(DLR class) A repeats-containing (SORL1)

Target Exon



NM_007181*:Homo sapiens mitogen-activated protein kinase kinase kinase kinase 1 (MAP4K1), mRNA.
C6001282:gi|4504223|ref|NP_000172.1| glucuronidase, beta [Homo sapiens] gi|114963|sp|P082
Target Exon 



Target Exon 



NM_022819*:Homo sapiens phospholipase A2, group IIF (PLA2G2F), mRNA. VERSION NM_020245.2 GI
NM_024408*:Homo sapiens Notch (Drosophila) homolog 2 (NOTCH2), mRNA. VERSION NM_024410.1 GI
Insulin-like growth factor 2 (somatomedin A) (IGF2)

NM_021628*:Homo sapiens arachidonate lipoxygenase 3 (ALOXE3), mRNA. VERSION NM_020229.1 GI
NM_005569*:Homo sapiens LIM domain kinase 2 (LIMK2), transcript variant 2a, mRNA.
Target Exon 



Target Exon 



ESTs 



desmoplakin (DPI, DPII) 



gb:z!73d06.r1 Stratagene colon (937204) Homo sapiens cDNA clone 5', mRNA sequence

methionine adenosyltransferase II, beta



phosphorylase kinase, alpha 2 (liver)

DKFZP586H2123 protein 

serine (or cysteine) proteinase inhibitor, clade 
B (ovalbumin), member 5 

zinc finger protein 36 (KOX 18) 
mitogen-activated protein kinase kinase 2
integrin, alpha 7
hypothetical protein MGC11352 
413564 133 gb:601146990F1 NIH_MGC_19 Homo sapiens cDNA clone 5', mRNA sequence
413786 133 Hs.13500 ESTs
413840 133 Hs.356228 RNA binding motif protein, X chromosome
413929 133 Hs.75617 collagen, type IV, alpha 2
414223 133 Hs.238246 hypothetical protein FLJ22479
414732 133 Hs.77152 minichromosome maintenance deficient (S. cerevisiae) 7
414762 133 Hs.77257 KIAA0068 protein
414840 133 Hs.23823 hairy/enhancer-of-split related with YRPW motif-like
414843 133 Hs.77492 heterogeneous nuclear ribonucleoprotein A0
414895 133 Hs.116278 Homo sapiens cDNA FLJ13571 fis, clone PLACE1008405
414907 133 Hs.77597 polo (Drosophila)-like kinase
414918 133 Hs.72222 hypothetical protein FLJ13459
415200 133 Hs.78202 SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 4
416640 133 Hs.79404 neuron-specific protein
416815 133 Hs.80120 UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 1 (GalNAc-T1)
416977 133 Hs.406103 hypothetical protein FKSG44
417615 133 Hs.82314 hypoxanthine phosphoribosyltransferase 1 (Lesch-Nyhan syndrome)
417839 133 Hs.82712 fragile X mental retardation, autosomal homolog 1
417900 133 Hs.82906 CDC20 (cell division cycle 20, S. cerevisiae, homolog)
417924 133 Hs.82932 cyclin D1 (PRAD1: parathyroid adenomatosis 1)
418127 133 Hs.83532 membrane cofactor protein (CD46, trophoblast-lymphocyte cross-reactive antigen)
418321 133 Hs.84087 KIAA0143 protein
418504 133 Hs.85335 Homo sapiens mRNA; cDNA DKFZp564D1462 (from clone DKFZp564D1462)
418629 133 Hs.86859 growth factor receptor-bound protein 7
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419602 
419847 
420079 
420116 
420307 
420613 
420732 
421026 
421075 
421101 
421186 
421311 
421475 
421505 
421595 
421628 
421649 
421733 
421782 
421989 
422043 
422068 
422506 
422913 
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133 Hs.91521 

133 Hs.184544

133 Hs.94896 

133 Hs.95231 

133 Hs.66219 

133 Hs.406637 

133 Hs.367762 

133 Hs.101067 

133 Hs.101474 

133 Hs.101840 

133 Hs.270563 

133 Hs.283609 

133 Hs.104640

133 Hs.285641 

133 Hs.301685 

133 Hs.106210 

133 Hs.106415 

133 Hs.1420 

133 Hs.108258 

133 Hs.110457

133 Hs.110953

133 Hs.104520 

133 Hs.300741 

133 Hs.121599 



hypothetical protein 

Homo sapiens, clone IMAGE:3355383, 
mRNA, partial cds 

PTD011 protein 
FH1/FH2 domain-containing protein 

ESTs 



ESTs, Weakly similar to A47582 B-cell growth factor precursor [H.sapiens]

ESTs 



GCN5 (general control of amino-acid synthesis, yeast, homolog)-like 2

KIAA0807 protein



major histocompatibility complex, class I-like sequence

ESTs, Moderately similar to T12512 hypothetical protein DKFZp434G232.1 [H.sapiens]

hypothetical protein PRO2032 



HIV-1 inducer of short transcripts binding 
protein; lymphoma related factor 

KIAA1111 protein 



KIAA0620 protein 
hypothetical protein FLJ10813



peroxisome proliferative activated receptor, 

delta 

fibroblast growth factor receptor 3 (achondro- 
plasia, thanatophoric dwarfism) 

actin binding protein; macrophin (microfilament and actin filament cross-linker protein)

Wolf-Hirschhorn syndrome candidate 1 



retinoic acid induced 1 



Homo sapiens cDNA FLJ13694 fis, clone PLACE2000115

sorcin 



CGI-18 protein 
308 EOS Hu03 422929 133 Hs.94011
309 EOS Hu03 422959 133 Hs.349256
310 EOS Hu03 423138 133
311 EOS Hu03 423185 133 Hs.380062
312 EOS Hu03 423599 133 Hs.31731
313 EOS Hu03 423810 133 Hs.132955
314 EOS Hu03 423960 133 Hs.136309
315 EOS Hu03 424244 133 Hs.143601
316 EOS Hu03 424415 133 Hs.146580
317 EOS Hu03 424909 133 Hs.153752
318 EOS Hu03 424959 133 Hs.153937
319 EOS Hu03 425093 133 Hs.154525
320 EOS Hu03 425097 133 Hs.154545
321 EOS Hu03 425205 133 Hs.155106
322 EOS Hu03 425221 133 Hs.155188
323 EOS Hu03 425243 133 Hs.155291
324 EOS Hu03 425380 133 Hs.32148
325 EOS Hu03 426028 133 Hs.172028
326 EOS Hu03 426125 133 Hs.166994
327 EOS Hu03 426177 133 Hs.167700
328 EOS Hu03 426252 133 Hs.28917
329 EOS Hu03 426468 133 Hs.117558
330 EOS Hu03 426469 133 Hs.363039
331 EOS Hu03 426508 133 Hs.170171
332 EOS Hu03 426682 133 Hs.2056
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ESTs, Weakly similar to MGB4_HUMAN 
MELANOMA-ASSOCIATED ANTIGEN B4 

[H.sapiens] 

paired immunoglobulin-like receptor beta



gb:EST385571 MAGE resequences, MAGM 
Homo sapiens cDNA, mRNA sequence 

ornithine decarboxylase antizyme 1



peroxiredoxin 5



BCL2/adenovirus E1B 19kD-interacting pro- 
tein 3-like 

SH3-containing protein SH3GLB1 



hypothetical protein hCLA-iso
enolase 2, (gamma, neuronal)
cell division cycle 25B 
activated p21cdc42Hs kinase 
KIAA1076 protein 



POZ domain containing guanine nucleotide 
exchange factor(GEF)1 

receptor (calcitonin) activity modifying protein 

2 

TATA box binding protein (TBP)-associated factor, RNA polymerase II, F, 55kD

KIAA0005 gene product 



AD-015 protein 

a disintegrin and metalloproteinase domain 10 

(ADAM 10) 

FAT tumor suppressor (Drosophila) homolog 



Homo sapiens cDNA FLJ10174 fis, clone HEMBA1003959

ESTs 



ESTs 



methylmalonate-semialdehyde dehydro- 
genase 

glutamate-ammonia ligase (glutamine syn- 
thase) 
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UDP glycosyltransferase 1 family, polypeptide pro- 
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343 EOS Hu03 428318 

344 EOS Hu03 428712 

345 EOS Hu03 428901 

346 EOS Hu03 429124 

347 EOS Hu03 429187 

348 EOS Hu03 429311

349 EOS Hu03 429561 

350 EOS Hu03 429802 

351 EOS Hu03 429953 

352 EOS Hu03 430604 

353 EOS Hu03 430677 

354 EOS Hu03 430746 

355 EOS Hu03 431604 

356 EOS Hu03 431842 
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133 Hs.303154 

133 Hs.173091

133 Hs.356512 

133 Hs.123253 

133 Hs.284232 

133 Hs.180479 

133 Hs.180655 

133 Hs.181369

133 Hs.300855

133 Hs.183435

133 Hs.356190 

133 Hs.190452 

133 Hs.146668 

133 Hs.196914

133 Hs.163872

133 Hs.198998

133 Hs.250646 

133 Hs.5367 

133 Hs.226581 

133 Hs.247309 

133 Hs.359784 

133 Hs.406256 

133 Hs.264190 

133 Hs.271473 



A9 

popeye protein 3 
ubiquitin-like 3
ubiquitin carrier protein
hypothetical protein FLJ22009



tumor necrosis factor receptor superfamily, member 12 (translocating chain-association membrane protein)
hypothetical protein FLJ20116



serine/threonine kinase 12 



ubiquitin fusion degradation 1-like



KIAA0977 protein 



NM_004545:Homo sapiens NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 1 (7kD, MNLL) (NDUFB1), mRNA.

ubiquitin B 



KIAA0365 gene product 
KIAA1253 protein 
minor histocompatibility antigen HA-1 



ESTs, Weakly similar to S65657 alpha-1C-adrenergic receptor splice form 2 [H.sapiens]

conserved helix-loop-helix ubiquitous kinase



baculoviral IAP repeat-containing 6



ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]

COX15 (yeast) homolog, cytochrome c oxidase assembly protein

succinate-CoA ligase, GDP-forming, beta subunit

desmoglein 2 



ESTs 



vacuolar protein sorting 35 (yeast homolog)



epithelial protein up-regulated in carcinoma, 
membrane associated protein 17 
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133 Hs.293003 

133 Hs.49007 

133 Hs.179647

133 Hs.112160 
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133 Hs.79187 

133 Hs.106124 

133 Hs.273397 

133 Hs.4310 

133 Hs.65588 

133 Hs.117864

133 Hs.6361 

133 Hs.46366 

133 Hs.77542 

133 Hs.330716 

133 Hs.97871 

133 Hs.385719 

133 Hs.15670 

133 Hs.129037 



ADP-ribosyltransferase (NAD; poly (ADP-ribose) polymerase)-like 3

ESTs



neuroglobin



NCK-associated protein 1



calpastatin



ESTs, Weakly similar to PC4259 ferritin asso- 
ciated protein [H.sapiens] 



hypothetical protein 



Homo sapiens cDNA FLJ12195 fis, clone MAMMA1000865

Homo sapiens DNA helicase homolog (PIF1) 
mRNA, partial cds 

x 003 protein 



ESTs 
ESTs 

KIAA0710 gene product 
eukaryotic translation initiation factor 1A
DAZ associated protein 1 
ESTs 



mitogen-activated protein kinase kinase 1 
interacting protein 1 

KIAA0948 protein 



ESTs 



Homo sapiens cDNA FLJ14368 fis, clone HEMBA1001122

Homo sapiens, clone IMAGE:3845253, mRNA, partial cds

ESTs 



ESTs 
ESTs 
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133 Hs.30738 

133 Hs.6451 

133 Hs.75216 

133 Hs.375195 

133 Hs.350547 

133 Hs.334437 

133 Hs.6856 

133 Hs.158549

133 Hs.317714 

133 Hs.20950 

133 Hs.132545 

133 Hs.8148 

133 Hs.8375 

133 Hs.348514 

133 Hs.398102 

133 Hs.9670 

133 Hs.115472

133 Hs.380932 

133 Hs.351142 

133 Hs.10882 

133 Hs.11441 

133 Hs.250848 

133 Hs.288649 

133 Hs.182099



ESTs 



PRO0659 protein 



Homo sapiens cDNA FLJ13713 fis, clone
PLACE2000398, moderately similar to LAR 
PROTEIN PRECURSOR (LEUKOCYTE 
ANTIGEN RELATED) (EC 3.1.3.48) 

ESTs 



nuclear receptor co-repressor/HDAC3 com- 
plex subunit 

hypothetical protein MGC4248 



ash2 (absent, small, or homeotic, Drosophila, homolog)-like

ESTs, Weakly similar to T2D3_HUMAN TRANSCRIPTION INITIATION FACTOR TFIID 135 KDA SUBUNIT [H.sapiens]
pallid (mouse) homolog, pallidin



phospholysine phosphohistidine inorganic pyrophosphate phosphatase

ESTs 



selenoprotein T 



TNF receptor-associated factor 4 



ESTs, Moderately similar to 2109260A B cell 
growth factor [H.sapiens] 

Homo sapiens clone FLB3442 PRO0872 
mRNA, complete cds 

hypothetical protein FLJ10948



ESTs, Weakly similar to 2004399A chromo- 
somal protein [H.sapiens] 

CHMP1.5 protein 



ESTs 

HMG-box containing protein 1 
chromosome 1 open reading frame 8 
hypothetical protein FLJ14761
hypothetical protein MGC3077 
ESTs 
133 Hs.13351 LanC (bacterial lantibiotic synthetase component C)-like 1
133 Hs.15303 KIAA0349 protein
133 Hs.82845 Homo sapiens cDNA: FLJ21930 fis, clone HEP04301, highly similar to HSU90916 Human clone 23815 mRNA sequence
133 Hs.236894 ESTs, Highly similar to S02392 alpha-2-macroglobulin receptor precursor [H.sapiens]
133 Hs.18457 hypothetical protein FLJ20315
133 Hs.108923 RAB38, member RAS oncogene family
133 Hs.21356 hypothetical protein DKFZp762K2015
133 Hs.178470 hypothetical protein FLJ22662
133 Hs.267749 Human DNA sequence from clone 366N23 on chromosome 6q27. Contains two genes similar to consecutive parts of the C. elegans UNC-93 (protein 1, C46F11.1) gene, a KIAA0173 and Tubulin-Tyrosine Ligase LIKE gene, a Mitotic Feedback Control Protein MADP2H
133 Hs.22142 cytochrome b5 reductase b5R.2
133 Hs.23412 hypothetical protein FLJ20160
133 Hs.112860 zinc finger protein 258
133 Hs.25625 hypothetical protein FLJ11323
133 Hs.35254 hypothetical protein FLB6421
133 Hs.60659 ESTs, Weakly similar to T46471 hypothetical protein DKFZp434L0130.1 [H.sapiens]
133 Hs.57655 ESTs
133 Hs.27192 hypothetical protein dJ1057B20.2
133 Hs.211046 ESTs
133 Hs.279766 kinesin family member 4A
133 Hs.28285 patched related protein translocated in renal cancer
133 - gb:RC-BT068-130399-068 BT068 Homo sapiens cDNA, mRNA sequence
133 Hs.63368 ESTs, Weakly similar to TRHY_HUMAN TRICHOHYALI [H.sapiens]
133 Hs.172816 neuregulin 1
133 Hs.377915 mannosidase, alpha, class 2A, member 1
133 gb:RC2-ST0158-091099-011-d05 ST0158 Homo sapiens cDNA, mRNA sequence
133 Hs.399939 gb:nc39d05.M NCI_CGAP_Pr2 Homo sapiens cDNA clone, mRNA sequence
133 Hs.195471 Human cosmid CRI-JC2015 at D10S289 in 10sp13
133 Hs.103267 hypothetical protein FLJ22548 similar to gene trap PAT 12
133 Hs.152925 KIAA1268 protein
133 Hs.65450 reticulon 4
133 Hs.96264 alpha thalassemia/mental retardation syndrome X-linked (RAD54 (S. cerevisiae) homolog)
133 Hs.111862 KIAA0590 gene product
133 Hs.1578 baculoviral IAP repeat-containing 5 (survivin)
133 Hs.351597 ESTs
133 Hs.181461 ariadne homolog, ubiquitin-conjugating enzyme E2 binding protein, 1 (Drosophila)
133 Hs.5548 F-box and leucine-rich repeat protein 5
133 Hs.11923 hypothetical protein DJ167A19.1
133 Hs.334826 splicing factor 3b, subunit 1, 155kDa
133 Hs.30340 KIAA1165: likely ortholog of mouse Nedd4 WW domain-binding protein 5A
133 Hs.268016 ESTs
133 Hs.28959 cDNA FLJ36513 fis, clone TRACH2001523
133 Hs.359682 calpastatin
168 Hs.170328 NM_001910; cathepsin E isoform a preproprotein NM_148964; cathepsin E isoform b preproprotein CIS
168 Hs.173381 NM_019894; transmembrane protease, serine 4 isoform 1 NM_183247; transmembrane protease, serine 4 isoform 2 CIS
168 Hs.159557 NM_000228; laminin subunit beta 3 precursor CIS
168 Hs.156346 NM_030570; uroplakin 3B isoform a NM_182683; uroplakin 3B isoform c NM_182684; uroplakin 3B isoform b CIS
168 Hs.25035 NM_005547; involucrin CIS
168 Hs.443811 NM_004692; NM_032727; internexin neuronal intermediate filament protein, alpha CIS
168 Hs.118110 NM_016233; peptidylarginine deiminase type III CIS
168 Hs.406475 NM_014417; BCL2 binding component 3 CIS
NM_020142; NADH:ubiquinone oxidoreductase MLRQ subunit homolog CIS
NM_018058; cartilage acidic protein 1 CIS
NM_000497; cytochrome P450, subfamily XIB (steroid 11-beta-hydroxylase), polypeptide 1 precursor CIS
NM_007193; annexin A10 CIS
NM_001958; eukaryotic translation elongation factor 1 alpha 2 CIS
NM_005581; Lutheran blood group (Auberger b antigen included) CIS
NM_005581; Lutheran blood group (Auberger b antigen included) CIS
NM_030570; uroplakin 3B isoform a NM_182683; uroplakin 3B isoform c NM_182684; uroplakin 3B isoform b CIS
NM_000300; phospholipase A2, group IIA (platelets, synovial fluid) CIS
NM_007193; annexin A10 CIS
NM_007144; ring finger protein 110 CIS
NM_014417; BCL2 binding component 3 CIS
NM_001442; fatty acid binding protein 4, adipocyte CIS
NM_017689; hypothetical protein FLJ20151 CIS
NM_007144; ring finger protein 110 CIS
NM_004692; NM_032727; internexin neuronal intermediate filament protein, alpha CIS
NM_001248; ectonucleoside triphosphate diphosphohydrolase 3 CIS
NM_017689; hypothetical protein FLJ20151 CIS
NM_001958; eukaryotic translation elongation factor 1 alpha 2 CIS
NM_016233; peptidylarginine deiminase type III CIS
NM_000445; plectin 1, intermediate filament binding protein 500kDa CIS
NM_000213; integrin, beta 4 CIS
NM_019894; transmembrane protease, serine 4 isoform 1 NM_183247; transmembrane protease, serine 4 isoform 2 CIS
NM_000213; integrin, beta 4 CIS
NM_002145; homeo box B2 CIS
NM_006760; uroplakin 2 CIS
NM_001910; cathepsin E isoform a preproprotein NM_148964; cathepsin E isoform b preproprotein CIS
NM_006942; SRY-box 15 CIS
NM_001248; ectonucleoside triphosphate diphosphohydrolase 3 CIS
NM_005522; homeobox A1 protein isoform a NM_153620; homeobox A1 protein isoform b CIS
NM_003282; troponin I, skeletal, fast CIS
NM_015162; lipidosin CIS
NM_015162; lipidosin CIS
NM_030570; uroplakin 3B isoform a NM_182683; uroplakin 3B isoform c NM_182684; uroplakin 3B isoform b CIS
NM_000213; integrin, beta 4 CIS
NM_006760; uroplakin 2 CIS
NM_015162; lipidosin CIS
NM_000228; laminin subunit beta 3 precursor CIS
NM_007144; ring finger protein 110 CIS
NM_000228; laminin subunit beta 3 precursor CIS
NM_001248; ectonucleoside triphosphate diphosphohydrolase 3 CIS
NM_007193; annexin A10 CIS
NM_017689; hypothetical protein FLJ20151 CIS
NM_020142; NADH:ubiquinone oxidoreductase MLRQ subunit homolog CIS
NM_001958; eukaryotic translation elongation factor 1 alpha 2 CIS
NM_000300; phospholipase A2, group IIA (platelets, synovial fluid) CIS
NM_001910; cathepsin E isoform a preproprotein NM_148964; cathepsin E isoform b preproprotein CIS
NM_007144; ring finger protein 110 CIS
NM_014417; BCL2 binding component 3 CIS
NM_005581; Lutheran blood group (Auberger b antigen included) CIS
NM_003282; troponin I, skeletal, fast CIS
NM_020142; NADH:ubiquinone oxidoreductase MLRQ subunit homolog CIS
NM_000445; plectin 1, intermediate filament binding protein 500kDa CIS
NM_005547; involucrin CIS
NM_000299; plakophilin 1 CIS
NM_002145; homeo box B2 CIS
NM_000497; cytochrome P450, subfamily XIB (steroid 11-beta-hydroxylase), polypeptide 1 precursor CIS
NM_007193; annexin A10 CIS
NM_005522; homeobox A1 protein isoform a NM_153620; homeobox A1 protein isoform b CIS
NM_006760; uroplakin 2 CIS
NM_005547; involucrin CIS
NM_000497; cytochrome P450, subfamily XIB (steroid 11-beta-hydroxylase), polypeptide 1 precursor CIS
NM_005522; homeobox A1 protein isoform a NM_153620; homeobox A1 protein isoform b CIS
NM_002145; homeo box B2 CIS
NM_001442; fatty acid binding protein 4, adipocyte CIS
NM_006942; SRY-box 15 CIS
NM_006942; SRY-box 15 CIS
NM_016233; peptidylarginine deiminase type III CIS
NM_018058; cartilage acidic protein 1 CIS
NM_001248; ectonucleoside triphosphate diphosphohydrolase 3 CIS
NM_006760; uroplakin 2 CIS
NM_018058; cartilage acidic protein 1 CIS
NM_005547; involucrin CIS
NM_000445; plectin 1, intermediate filament binding protein 500kDa CIS
NM_003282; troponin I, skeletal, fast CIS
NM_001910; cathepsin E isoform a preproprotein NM_148964; cathepsin E isoform b preproprotein CIS
NM_000228; laminin subunit beta 3 precursor CIS
NM_000299; plakophilin 1 CIS
NM_020142; NADH:ubiquinone oxidoreductase MLRQ subunit homolog CIS
NM_001442; fatty acid binding protein 4, adipocyte CIS
NM_000445; plectin 1, intermediate filament binding protein 500kDa CIS
NM_000300; phospholipase A2, group IIA (platelets, synovial fluid) CIS
NM_019894; transmembrane protease, serine 4 isoform 1 NM_183247; transmembrane protease, serine 4 isoform 2 CIS
NM_004692; NM_032727; internexin neuronal intermediate filament protein, alpha CIS
NM_030570; uroplakin 3B isoform a NM_182683; uroplakin 3B isoform c NM_182684; uroplakin 3B isoform b CIS
NM_001442; fatty acid binding protein 4, adipocyte CIS
NM_016233; peptidylarginine deiminase type III CIS
NM_018058; cartilage acidic protein 1 CIS
NM_000300; phospholipase A2, group IIA (platelets, synovial fluid) CIS
NM_000299; plakophilin 1 CIS
NM_000299; plakophilin 1 CIS
NM_001958; eukaryotic translation elongation factor 1 alpha 2 CIS
NM_005625; syndecan binding protein (syntenin) CIS
NM_002719; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform a NM_178586; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform b NM_178587; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform c NM_178588; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform d CIS
NM_001560; interleukin 13 receptor, alpha 1 precursor CIS
NM_001166; baculoviral IAP repeat-containing protein 2 CIS
NM_007373; soc-2 suppressor of clear homolog CIS
NM_003563; speckle-type POZ protein CIS
NM_012161; F-box and leucine-rich repeat protein 5 isoform 1 NM_033535; F-box and leucine-rich repeat protein 5 isoform 2 CIS
NM_015716; misshapen/NIK-related kinase isoform 1 NM_153827; misshapen/NIK-related kinase isoform 3 NM_170663; misshapen/NIK-related kinase isoform 2 CIS
NM_003925; methyl-CpG binding domain protein 4 CIS
NM_012164; F-box and WD-40 domain protein 2 CIS
NM_015125; capicua homolog CIS
CIS
NM_015076; cyclin-dependent kinase (CDC2-like) 11 CIS
NM_018957; SH3-domain binding protein 1 CIS
NM_018695; erbb2 interacting protein CIS
NM_012097; ADP-ribosylation factor-like 5 isoform 1 NM_177985; ADP-ribosylation factor-like 5 isoform 2 CIS



The expression level of at least one gene in the sample is determined, wherein at least one 
of said genes is selected from the genes of Table A. The samples according to the present 
invention may be any tissue sample or body fluid sample; it is, however, often preferred to
conduct the methods according to the invention on epithelial tissue, such as epithelial tissue
from the bladder. In particular the epithelial tissue may be mucosa. In another embodiment 
the sample is a urine sample comprising the tissue cells. 



The sample may be obtained in any suitable manner known to the man skilled in the art,
such as a biopsy of the tissue, or a superficial sample scraped from the tissue. The sample
may be prepared by forming a cell suspension made from the tissue, or by obtaining an
extract from the tissue.



In one embodiment it is preferred that the sample comprises substantially only cells from
said tissue, such as substantially only cells from mucosa of the bladder.

The methods according to the invention may be used for determining any biological
condition, wherein said condition leads to a change in the expression of at least one gene, and
preferably a change in a variety of genes.



Thus, the biological condition may be any malignant or premalignant condition, in particular
in the bladder, such as a tumor or an adenocarcinoma, a carcinoma, a teratoma, a sarcoma,
and/or a lymphoma, and/or carcinoma-in-situ, and/or dysplasia-in-situ.

The expression level may be determined by single-gene approaches, i.e. wherein the
determination of expression from one, two, or a few genes is conducted. It is, however, preferred
that information is obtained from several genes, so that an expression pattern is obtained.

In a preferred embodiment expression from at least one gene from a first group is
determined, said first gene group representing genes being expressed at a higher level in one
type of tissue, i.e. tissue in one stage or one risk group, in combination with determination of
expression of at least one gene from a second group, said second group representing genes 
being expressed at a higher level in tissue from another stage or from another risk group. 
Thereby the validity of the prediction increases, since expression levels from genes from 
more than one group are determined. 
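
The two-group combination described above can be sketched as a simple score. This is a minimal illustration only, not the method of the invention: the gene names, the log2 expression values, and the choice of a difference of group means are all assumptions made for the example.

```python
from statistics import mean


def two_group_score(expression, group_a, group_b):
    """Mean expression of group A genes minus mean expression of group B genes.

    expression: dict mapping gene name -> log2 expression level (hypothetical data).
    group_a: genes assumed to be higher in one stage or risk group.
    group_b: genes assumed to be higher in another stage or risk group.
    """
    a = [expression[g] for g in group_a if g in expression]
    b = [expression[g] for g in group_b if g in expression]
    if not a or not b:
        raise ValueError("need at least one measured gene from each group")
    return mean(a) - mean(b)


# Hypothetical sample: placeholder gene names, not genes from the tables.
sample = {"geneA1": 8.0, "geneA2": 7.0, "geneB1": 4.0, "geneB2": 5.0}
score = two_group_score(sample, ["geneA1", "geneA2"], ["geneB1", "geneB2"])
print(score)
```

A positive score would point towards the tissue type represented by the first group, a negative score towards the second; where to place a decision threshold is left open here.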

However, determination of the expression of a single gene, whether belonging to the first
group or the second group, is also within the scope of the present invention. In this case it is
preferred that the single gene is selected among genes having a high change in expression
level from normal cells to biological condition cells.
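
One way to pick single genes with a high change in expression level is to rank them by log2 fold change between normal and condition cells. The sketch below assumes hypothetical gene names and linear-scale expression values; the fold-change criterion is an illustrative choice and is not prescribed by the text.

```python
from math import log2


def rank_by_fold_change(normal, condition):
    """Return (gene, log2 fold change) pairs, largest absolute change first.

    normal, condition: dicts mapping gene name -> expression level on a
    linear scale (hypothetical data). Genes with non-positive values are skipped.
    """
    changes = {
        gene: log2(condition[gene] / normal[gene])
        for gene in normal
        if gene in condition and normal[gene] > 0 and condition[gene] > 0
    }
    return sorted(changes.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Placeholder values for illustration only.
normal = {"gene1": 100.0, "gene2": 200.0, "gene3": 50.0}
tumor = {"gene1": 800.0, "gene2": 100.0, "gene3": 50.0}
ranking = rank_by_fold_change(normal, tumor)
for gene, change in ranking:
    print(gene, change)
```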

Another approach is determination of an expression pattern from a variety of genes, wherein
the determination of the biological condition in the tissue relies on information from the
expression of a variety of genes, i.e. on the combination of expressed genes rather than on the
information from single genes.

The data presented herein relate to bladder tumors, and therefore the description
has focused on the gene expression level as one way of identifying genes that lose or gain
function in cancer tissue. Genes showing a remarkable downregulation (or complete loss) or
upregulation (gene expression gained de novo) of the expression level, measured as the
mRNA transcript, during the malignant progression in the bladder from normal mucosa through
Ta superficial tumors and carcinoma in situ (CIS) to T1, slightly invasive tumors, and to T2, T3
and T4, which have spread to muscle or even further into lymph nodes or other organs, are
within the scope of the invention, as well as genes gaining importance during the
differentiation from normal towards malignancy.

The present invention relates to a variety of genes identified by an EST identification
number and/or by a gene identification number. Both types of identification number relate to
identification numbers of the UniGene database, NCBI, build 18.

The various genes have been identified using Affymetrix arrays of the following product 
numbers: 

HUGeneFL (sold in 2000-2002)
EOS Hu03 (customized Affymetrix array)
U133A (product #900367 sold in 2003)

The stage of a bladder tumor indicates how deeply the tumor has penetrated. Superficial tumors
are termed Ta and carcinoma in situ (CIS), and T1, T2, T3 and T4 are used to describe
increasing degrees of penetration into the muscle. The grade of a bladder tumor is
expressed on a scale of I-IV (1-4) according to Bergkvist, A.; Ljungquist, A.; Moberger,
B., "Classification of bladder tumours based on the cellular pattern. Preliminary report of a
clinical-pathological study of 300 cases with a minimum follow-up of eight years", Acta Chir
Scand., 1965, 130(4):371-8. The grade reflects the cytological appearance of the cells.
Grade I cells are almost normal. Grade II cells are slightly deviant. Grade III cells are clearly
abnormal, and Grade IV cells are highly abnormal. A special form of bladder malignancy is
carcinoma-in-situ or dysplasia-in-situ, in which the altered cells are located in situ.

It is important to predict the prognosis of a cancer disease, as superficial tumors may require
a less intensive treatment than invasive tumors. According to the invention the expression
level of genes may be used to identify genes whose expression can be used to identify a
certain stage and/or the prognosis of the disease. These "Classifiers" are divided into those
which can be used to identify Ta, carcinoma in situ (CIS), T1, and T2 stages and
those identifying risk of recurrence or progression. In one aspect of the invention, measuring
the transcript level of one or more of these genes may lead to a classifier that can add
supplementary information to the information obtained from the pathological classification. For
example, gene expression levels that signify a T2 stage will be unfavourable to detect in a Ta
tumor, as they may signal that the Ta tumor has the potential to become a T2 tumor. The
opposite is probably also true: an expression level that signifies Ta will be favourable to
have in a T2 tumor. In that way, independent information may be obtained from the pathological
classification and from a classification based on gene expression levels.

In the present context a standard expression level is the level of expression of a gene in a
standard situation, such as a standard Ta tumor or a standard T2 tumor. For use in the
present invention standard expression levels are determined for each stage as well as for each
group of progression, recurrence, and other prognostic indices. It is then possible to
compare the result of a determination of the expression level from a gene of a given biological
condition with a standard for each stage, progression, recurrence and other indices to obtain
a classification of the biological condition.
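
The comparison with standard expression levels can be illustrated with a minimal sketch. The gene names, the numeric levels and the distance measure (Euclidean distance on log2-transformed levels) are assumptions chosen for illustration only; they are not taken from the tables and the description does not prescribe a particular metric.

```python
import math

# Hypothetical standard expression levels (arbitrary units) per stage.
# Gene names and values are illustrative only.
STANDARDS = {
    "Ta": {"gene_a": 100.0, "gene_b": 800.0},
    "T2": {"gene_a": 900.0, "gene_b": 150.0},
}


def classify_by_standard(sample, standards=STANDARDS):
    """Return the class whose standard levels lie closest to the sample.

    sample: dict mapping gene name -> measured expression level (> 0).
    Distance is Euclidean on log2-transformed levels (an assumed metric).
    """
    def distance(standard):
        return math.sqrt(sum(
            (math.log2(sample[gene]) - math.log2(level)) ** 2
            for gene, level in standard.items()
        ))

    return min(standards, key=lambda name: distance(standards[name]))
```

The same function could be applied to standards defined for progression or recurrence groups instead of stages, by passing a different `standards` dictionary.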

Furthermore, in the present context a reference pattern refers to the pattern of expression
levels seen in standard situations as discussed above, and reference patterns may be used 
as discussed above for standard expression levels. 

It is known from the histopathological classification of bladder tumors that some information
is obtained from merely classifying into stage and grade of tumor. Accordingly, in one
aspect, the invention relates to a method of predicting the prognosis of the biological condition
by determining the stage of the biological condition, by determining an expression level of at
least one gene, wherein said gene is selected from the group of genes consisting of gene No.
1 to gene No. 562. In this aspect information about the stage directly reveals information
about the prognosis as well. An example hereof is when a bladder tumor is classified as for
example stage T2, then the prognosis for the bladder tumor is obtained directly from the
prognosis related generally to stage T2 tumors. In a preferred embodiment the genes for
predicting the prognosis by establishing the stage of the tumor may be selected from the
group of genes consisting of gene No. 1 to gene No. 188. More preferably
the genes for predicting the prognosis by establishing the stage of the tumor may be
selected from the group of genes consisting of gene Nos. 18, 39, 40, 55,
58, 79, 86, 87, 88, 91, 93, 103, 105, 106, 121, 123, 125, 126, 136, 137, 140, 149, 156, 158,
161, 165, 166, 167, 175, 184, 187, 188.

It is preferred that the expression level of more than one gene is determined, such as the
expression level of at least two genes, such as the expression level of at least three genes, such as
the expression level of at least four genes, such as the expression level of at least five
genes, such as the expression level of at least six genes, such as the expression level of at
least seven genes, such as the expression level of at least eight genes, such as the
expression level of at least nine genes, such as the expression level of at least ten genes, such as
the expression level of at least 15 genes, such as the expression level of at least 20 genes,
such as the expression level of at least 25 genes, such as the expression level of at least
30 genes, such as the expression level of 32 genes.

As discussed above, in relation to bladder cancer the stages of a bladder tumor are selected
from bladder cancer stages Ta, Carcinoma in situ, T1, T2, T3 and T4. In an embodiment the
determination of a stage comprises
assaying the expression of at least one Ta stage gene from a Ta stage gene group, at least
one CIS gene, at least one T1 stage gene from a T1 stage gene group, and at least one T2
stage gene from a T2 stage gene group, and more preferably
assaying the expression of at least one Ta stage gene from a Ta stage gene group, at
least one CIS gene, at least one T1 stage gene from a T1 stage gene group, at least one T2
stage gene from a T2 stage gene group, at least one T3 stage gene from a T3 stage gene
group, and at least one T4 stage gene from a T4 stage gene group,
wherein at least one gene from each gene group
is expressed in a significantly different amount in that stage than in one of the other stages.



Preferably, the genes selected may be a gene from each gene group being expressed in a
significantly higher amount in that stage than in one of the other stages as compared to
normal controls, see for example Table B below.



The genes selected may be a gene from each gene group being expressed in a significantly
lower amount in that stage than in one of the other stages. 

In another embodiment the present invention relates to a method of predicting the prognosis
of a biological condition by obtaining information in addition to the stage classification as
such. As described above, by determining gene expression levels that signify a T2 stage in a
tumor otherwise classified as a Ta tumor, the expression levels signal that the Ta tumor has
the potential to become a T2 tumor. The opposite is also true, that an expression level that
signifies Ta will be favorable to have in a T2 tumor. In the present invention the inventors have
shown that some genes are relevant for obtaining this additional information.

Also, in one embodiment the present invention relates to a further method of predicting the
prognosis of a biological condition by obtaining information in addition to the stage
classification as such. Determination of squamous metaplasia in a tumor, in particular in a T2 stage
tumor, is indicative of risk of progression. In particular the genes may be selected from
the group of genes consisting of gene No. 215 to gene No. 232, see also table
H.



It is preferred that the expression level of more than one gene is determined, such as the
expression level of at least two genes, such as the expression level of at least three genes, such as
the expression level of at least four genes, such as the expression level of at least five
genes, such as the expression level of at least six genes, such as the expression level of at
least seven genes, such as the expression level of at least eight genes, such as the
expression level of at least nine genes, such as the expression level of at least ten genes, such as
the expression level of at least 15 genes, such as the expression level of 18 genes.

In another embodiment the invention relates to genes bearing information of recurrence of
the biological condition as such. In particular the genes may be selected from
the group of genes consisting of gene No. 189 to gene No. 214. It is preferred to
determine a first expression level of at least one gene from a first gene group, wherein the gene
from the first gene group is selected from the group of genes wherein expression is
increased in case of recurrence, genes No. 189 to No. 199 (recurrence genes), and to
determine a second expression level of at least one gene from a second gene group, wherein
the second gene group is selected from the group of genes wherein expression is increased
in case of no recurrence, genes No. 200 to No. 214 (non-recurrence genes), and to correlate
the first expression level to a standard expression level for progressors, and/or the second
expression level to a standard expression level for non-progressors to predict the prognosis
of the biological condition in the animal tissue, see also table C.

It is preferred that the expression level of more than one gene is determined, such as the
expression level of at least two genes, such as the expression level of at least three genes, such as
the expression level of at least four genes, such as the expression level of at least five
genes, such as the expression level of at least six genes, such as the expression level of at
least seven genes, such as the expression level of at least eight genes, such as the
expression level of at least nine genes, such as the expression level of at least ten genes, such as
the expression level of at least 15 genes, such as the expression level of at least 20 genes,
such as the expression level of at least 25 genes, such as the expression level of 26 genes.

Furthermore, in another embodiment the invention relates to genes bearing information of
progression as such. In particular the genes may be selected from the group of genes of
table D, more preferably selected from the group of genes consisting of gene No. 233 to
gene No. 446. More preferably the genes may be selected from the group of genes Nos.
255, 273, 279, 280, 281, 282, 287, 295, 300, 311, 317, 320, 333, 346, 347, 349, 352, 364,
365, 373, 383, 386, 390, 394, 401, 407, 414, 417, 426, 427, 428, 433, 434, 435, 436, 437,
438, 439, 440, 441, 442, 443, 444, 445, 446, see table E.

It is preferred that the expression level of more than one gene is determined, such as the
expression level of at least two genes, such as the expression level of at least three genes, such as
the expression level of at least four genes, such as the expression level of at least five
genes, such as the expression level of at least six genes, such as the expression level of at
least seven genes, such as the expression level of at least eight genes, such as the
expression level of at least nine genes, such as the expression level of at least ten genes, such as
the expression level of at least 15 genes, such as the expression level of at least 20 genes,
such as the expression level of at least 25 genes, such as the expression level of at least
30 genes, such as the expression level of at least 35 genes, such as the expression level of
at least 40 genes, such as the expression level of 45 genes.



Furthermore, it is within the scope of the invention to predict the prognosis of a biological
condition in animal tissue by determining the expression level of at least two genes, by

determining a first expression level of at least one gene from a first gene group, wherein
the gene from the first gene group is selected from the group of gene Nos. 237, 238,
239, 240, 241, 242, 243, 245, 246, 247, 248, 250, 253, 254, 257, 258, 260, 263, 264,
265, 267, 270, 271, 272, 278, 283, 284, 287, 288, 290, 291, 292, 294, 297, 298, 300,
302, 303, 305, 309, 310, 315, 316, 317, 318, 319, 321, 324, 329, 335, 336, 337, 339,
340, 344, 346, 347, 354, 356, 358, 359, 362, 364, 365, 368, 369, 371, 372, 377, 378,
379, 380, 381, 382, 383, 384, 388, 391, 393, 395, 396, 397, 399, 402, 403, 404, 409,
413, 417, 419, 420, 421, 422, 423, 425, 427, 429, 430, 431, 432, 437, 444 (progressor
genes), and

determining a second expression level of at least one gene from a second gene group,
wherein the second gene group is selected from the group of genes Nos. 233, 234, 235,
236, 244, 249, 251, 252, 255, 256, 259, 261, 262, 266, 268, 269, 273, 274, 275, 276,
277, 279, 280, 281, 282, 285, 286, 289, 293, 295, 296, 299, 301, 304, 306, 307, 308,
311, 312, 313, 314, 320, 322, 323, 325, 326, 327, 328, 330, 331, 332, 333, 334, 338,
341, 342, 343, 345, 348, 349, 350, 351, 352, 353, 355, 357, 360, 361, 363, 366, 367,
370, 373, 374, 375, 376, 385, 386, 387, 389, 390, 392, 394, 398, 400, 401, 405, 406,
407, 408, 410, 411, 412, 414, 415, 416, 418, 424, 426, 428, 433, 434, 435, 436, 438,
439, 440, 441, 442, 443, 445, 446 (non-progressor genes), and

correlating the first expression level to a standard expression level for progressors,
and/or the second expression level to a standard expression level for non-progressors to
predict the prognosis of the biological condition in the animal tissue.
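
One possible way to combine the two gene groups is sketched below. The voting rule, the log-scale comparison and the numeric standard levels are assumptions made for illustration; the description only requires that the measured levels are correlated to standards for progressors and/or non-progressors. The keys `gene_237` and `gene_233` stand for one gene from the progressor group and one from the non-progressor group.

```python
import math


def closer_to(level, std_a, std_b):
    """True if level is closer to std_a than to std_b on a log2 scale."""
    return abs(math.log2(level / std_a)) < abs(math.log2(level / std_b))


def predict_progression(sample, standards):
    """Vote gene by gene on whether a sample resembles progressors.

    sample:    dict gene -> measured expression level (> 0).
    standards: dict gene -> (progressor standard, non-progressor standard).
    Each gene casts one vote; the majority decides (an assumed rule).
    """
    votes = sum(
        1 if closer_to(sample[gene], prog, non_prog) else -1
        for gene, (prog, non_prog) in standards.items()
        if gene in sample
    )
    if votes > 0:
        return "progressor"
    if votes < 0:
        return "non-progressor"
    return "undetermined"


# Hypothetical standard levels for one gene from each group.
EXAMPLE_STANDARDS = {
    "gene_237": (800.0, 100.0),  # higher in progressors
    "gene_233": (100.0, 800.0),  # higher in non-progressors
}
```

A tie between the votes is reported as undetermined rather than forced into either group.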

In particular the genes of the first group and the second group for predicting the prognosis of
a Ta stage tumor may be selected from the group of progression/non-progression genes
described above.



In yet another embodiment the present invention offers the possibility to predict the presence
or absence of Carcinoma in situ in the same organ as the primary biological condition. An
example hereof is for a Ta bladder tumor to predict whether the bladder in addition to the Ta
tumor comprises Carcinoma in situ (CIS). The presence of carcinoma in situ in a bladder
containing a superficial Ta tumor is a signal that the Ta tumor has the potential of recurrence
and invasiveness. Accordingly, by predicting the presence of carcinoma in situ important
information about the prognosis is obtained. In the present context, genes for predicting the
presence of carcinoma in situ for a Ta stage tumor may be selected from
the group of genes consisting of gene No. 447 to gene No. 562. More preferably the genes
are selected from the group of genes consisting of gene Nos. 447, 448, 449, 450, 451, 452,
453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470,
471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488,
489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506,
507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524,
525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542,
543, 544, 545, 546, see table F, or from the group of genes consisting of gene Nos. 547,
548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, see table G.

It is preferred that the expression level of more than one gene is determined, such as the
expression level of at least two genes, such as the expression level of at least three genes, such as
the expression level of at least four genes, such as the expression level of at least five
genes, such as the expression level of at least six genes, such as the expression level of at
least seven genes, such as the expression level of at least eight genes, such as the
expression level of at least nine genes, such as the expression level of at least ten genes, such as
the expression level of at least 15 genes, such as the expression level of at least 20 genes,
such as the expression level of at least 25 genes, such as the expression level of at least
30 genes, such as the expression level of at least 35 genes, such as the expression level of
at least 40 genes, such as the expression level of at least 45 genes, such as the expression
level of at least 50 genes, such as 100 genes. In another embodiment the expression level of
16 genes is determined.

It is also preferred to determine a first expression level of at least one gene from a first gene
group, wherein the gene from the first gene group is selected from the group of genes
wherein expression is increased in case of CIS, genes Nos. 447, 448, 449, 450, 451, 452,
454, 455, 456, 457, 458, 459, 462, 468, 474, 478, 484, 489, 491, 493, 495, 500, 501, 502,
504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 518, 519, 520, 522, 523, 524, 525,
529, 531, 534, 535, 536, 538, 544, 546, 547, 548, 549, 550, 551, 552, 553, 555, 556, 558,
559, 561, 562 (CIS genes), and to determine a second expression level of at least one gene
from a second gene group, wherein the second gene group is selected from the group of
genes wherein expression is increased in case of no CIS, genes Nos. 453, 460, 461, 463,
464, 465, 466, 467, 469, 470, 471, 472, 473, 475, 476, 477, 479, 480, 481, 482, 483, 485,
486, 487, 488, 490, 492, 494, 496, 497, 498, 499, 503, 515, 516, 517, 521, 526, 527, 528,
530, 532, 533, 537, 539, 540, 541, 542, 543, 545, 554, 557, 560 (non-CIS genes), and to
correlate the first expression level to a standard expression level for CIS, and/or the second
expression level to a standard expression level for non-CIS to predict the prognosis of the
biological condition in the animal tissue.

It is preferred when determining the expression level of at least one gene from a first group
and at least one gene from a second group that the expression level of more than one gene
from each group is determined. Thus, it is preferred that the expression level of more than one
gene is determined, such as the expression level of at least two genes, such as the
expression level of at least three genes, such as the expression level of at least four genes, such as
the expression level of at least five genes, such as the expression level of at least six genes,
such as the expression level of at least seven genes, such as the expression level of at least
eight genes, such as the expression level of at least nine genes, such as the expression
level of at least ten genes in each group.

In one embodiment of the invention the stage of the biological condition has been
determined before the prediction of prognosis. The stage may be determined by any suitable
means such as determined by histological examination of the tissue or by genotyping of the
tissue, preferably by genotyping of the tissue such as described herein or as described in
WO 02/02804 incorporated herein by reference.

In another aspect the invention relates to a method of determining the stage of a biological
condition in animal tissue,

comprising collecting a sample comprising cells from the tissue,

determining an expression level of at least one gene selected from the group of genes
consisting of gene No. 1 to gene No. 562,

correlating the expression level of the assessed genes to at least one standard level of
expression determining the stage of the condition.

In particular the expression level is determined of at least one gene selected from the group of
genes consisting of gene Nos. 1-457 and gene Nos. 459-535 and gene Nos. 537-562.

Specific embodiments of determining the stage are as described above for predicting
prognosis by determination of stage.

In a preferred embodiment the expression level of at least two genes is determined by

determining the expression of at least a first stage gene from a first stage gene group
and at least a second stage gene from a second stage gene group, wherein at least one
of said genes is expressed in said first stage of the condition in a higher amount than in
said second stage, and the other gene is expressed in said first stage of the condition
in a lower amount than in said second stage of the condition, and

correlating the expression level of the assessed genes to a standard level of expression
determining the stage of the condition.



In general, genes being downregulated for higher stage tumors as well as for progression
and recurrence may be of importance as predictive markers for the disease, as loss of one or
more of these may signal a poor outcome or an aggressive disease course. Furthermore,
they may be important targets for therapy, as restoring their expression level, e.g. by gene
therapy, or substitution with those peptide products or small molecules with a similar
biological effect, may suppress the malignant growth.



Genes that are up-regulated (or gained de novo) during the malignant progression of bladder
cancer from normal tissue through Ta, T1, T2, T3 and T4 are also within the scope of the
invention. These genes are potential oncogenes and may be those genes that create or
enhance the malignant growth of the cells. The expression level of these genes may serve as
predictive markers for the disease course and treatment response, as a high level may
signal an aggressive disease course, and they may serve as targets for therapy, as blocking
these genes by e.g. anti-sense therapy, or by biochemical means could inhibit, or slow the
tumor growth.

The genes used according to the invention show a sufficient difference in expression from 
one group to another and/or from one stage to another to use the gene as a classifier for the 
group and/or stage. Thus, comparison of an expression pattern to another may score a 
change from expressed to non-expressed, or the reverse. Alternatively, changes in intensity 
of expression may be scored, either increases or decreases. Any significant change can be 
used. Typical changes which are more than 2-fold are suitable. Changes which are greater
than 5-fold are highly suitable. 
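
The fold-change criterion can be written out as a short sketch. The function names and the returned labels are illustrative assumptions; only the 2-fold and 5-fold thresholds come from the text. Changes from expressed to non-expressed involve a level of zero and would need separate handling, which this sketch does not attempt.

```python
def fold_change(level_a, level_b):
    """Symmetric fold change between two positive expression levels.

    The result is always >= 1, so an increase and a decrease of the same
    magnitude score identically.
    """
    if level_a <= 0 or level_b <= 0:
        raise ValueError("expression levels must be positive")
    return max(level_a, level_b) / min(level_a, level_b)


def suitability(level_a, level_b):
    """Label a gene by the thresholds named in the text (2-fold, 5-fold)."""
    change = fold_change(level_a, level_b)
    if change > 5:
        return "highly suitable"
    if change > 2:
        return "suitable"
    return "below threshold"
```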

The present invention in particular relates to methods using genes wherein at least a two-fold
change in expression, such as at least a three-fold change, for example at least a four-fold
change, such as at least a five-fold change, for example at least a six-fold change, such
as at least a ten-fold change, for example at least a fifteen-fold change, such as at least a
twenty-fold change is seen between two groups.

As described above the invention relates to the use of information of expression levels. In
one embodiment an expression pattern is obtained; thus, the invention relates to a method
of determining an expression pattern of a bladder cell sample, comprising:

collecting a sample comprising bladder cells and/or expression products from bladder
cells,

determining the expression level of at least one gene in the sample, said gene being
selected from the group of genes consisting of gene No. 1 to gene No. 562, and obtaining
an expression pattern of the bladder cell sample.

The invention preferably includes more than one gene in the pattern; accordingly it is preferred
to include the expression level of at least two genes, such as the expression level of at least
three genes, such as the expression level of at least four genes, such as the expression
level of at least five genes, such as the expression level of more than six genes.

The expression pattern preferably relates to one or more of the group of genes discussed 
above with respect to prognosis relating to stage, SSC, progression, recurrence and/or CIS. 

In order to predict prognosis and/or stages it is preferred to determine an expression pattern
of a cell sample preferably independent of the proportion of submucosal, muscle and
connective tissue cells present. Expression is determined of one or more genes in a sample
comprising cells, said genes being selected from the same genes as discussed above and
shown in the tables.

It is an object of the present invention that characteristic patterns of expression of genes can
be used to characterize different types of tissue. Thus, for example gene expression patterns
can be used to characterize stages and grades of bladder tumors. Similarly, gene expression
patterns can be used to distinguish cells having a bladder origin from other cells. Moreover,
gene expression of cells which routinely contaminate bladder tumor biopsies has been
identified, and such gene expression can be removed or subtracted from patterns obtained
from bladder biopsies. Further, the gene expression patterns of single-cell solutions of
bladder tumor cells have been found to contain substantially less interfering expression of
contaminating muscle, submucosal, and connective tissue cells than biopsy samples.

The one or more genes exclude genes which are expressed in the submucosal, muscle, and
connective tissue. A pattern of expression is formed for the sample which is independent of
the proportion of submucosal, muscle, and connective tissue cells in the sample.

In another aspect of the invention a method of determining an expression pattern of a cell
sample is provided. Expression is determined of one or more genes in a sample comprising
cells. A first pattern of expression is thereby formed for the sample. Genes which are
expressed in submucosal, muscle, and connective tissue cells are removed from the first
pattern of expression, forming a second pattern of expression which is independent of the
proportion of submucosal, muscle, and connective tissue cells in the sample.

Another embodiment of the invention provides a method for determining an expression
pattern of a bladder mucosa or bladder cancer cell. Expression is determined of one or more
genes in a sample comprising bladder mucosa or bladder cancer cells; the expression
determined forms a first pattern of expression. A second pattern of expression which was
formed using the one or more genes and a sample comprising predominantly submucosal,
muscle, and connective tissue cells, is subtracted from the first pattern of expression,
forming a third pattern of expression. The third pattern of expression reflects expression of
the bladder mucosa or bladder cancer cells independent of the proportion of submucosal,
muscle, and connective tissue cells present in the sample.
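
The subtraction step may be sketched as follows. The text does not specify how the subtraction is carried out, so the gene-by-gene difference, the optional weighting by an assumed bladder wall fraction and the flooring at zero are all illustrative choices, as are the gene names in the usage example.

```python
def subtract_pattern(first, wall, fraction=1.0):
    """Subtract a bladder wall pattern from a biopsy pattern, gene by gene.

    first:    dict gene -> expression level measured in the sample.
    wall:     dict gene -> expression level measured in bladder wall tissue.
    fraction: assumed share of the wall signal present in the sample.
    Negative results are floored at zero, since a level cannot be negative.
    """
    return {
        gene: max(level - fraction * wall.get(gene, 0.0), 0.0)
        for gene, level in first.items()
    }


# Usage with hypothetical values: the result is the third pattern.
third = subtract_pattern(
    {"gene_a": 500.0, "gene_b": 80.0},
    {"gene_a": 200.0, "gene_b": 100.0},
)
```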

In one embodiment the invention provides a method to predict the prognosis of a bladder 
tumor as described above. A first pattern of expression is determined of one or more genes 
in a bladder tumor sample. The first pattern is compared to one or more reference patterns
of expression determined for bladder tumors at different stages and/or in different groups. 
The reference pattern which shares maximum similarity with the first pattern is identified. The 
stage of the reference pattern with the maximum similarity is assigned to the bladder tumor 
sample. 

In yet another embodiment the invention provides a method to determine the stage of a
bladder tumor as described above. A first pattern of expression is determined of one or more
genes in a bladder tumor sample. The first pattern is compared to one or more reference
patterns of expression determined for bladder tumors at different stages. The reference
pattern which shares maximum similarity with the first pattern is identified. The stage of the
reference pattern with the maximum similarity is assigned to the bladder tumor sample.
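
Assignment by maximum similarity can be sketched as below. Pearson correlation is used as the similarity measure purely as an assumption, since the text does not name one, and the gene names and reference values are hypothetical. The sketch needs at least two genes whose levels are not all identical within a pattern.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def assign_stage(sample, references):
    """Return the stage whose reference pattern is most similar to the sample.

    sample:     dict gene -> expression level (the first pattern).
    references: dict stage -> dict gene -> expression level.
    """
    genes = sorted(sample)
    observed = [sample[g] for g in genes]

    def similarity(reference):
        return pearson(observed, [reference[g] for g in genes])

    return max(references, key=lambda stage: similarity(references[stage]))


# Hypothetical reference patterns for two stages.
REFERENCES = {
    "Ta": {"g1": 1.0, "g2": 5.0, "g3": 2.0},
    "T2": {"g1": 6.0, "g2": 1.0, "g3": 4.0},
}
```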

Since a biopsy of the tissue often contains more tissue material such as connective tissue 
than the tissue to be examined, when the tissue to be examined is epithelial or mucosa, the 
invention also relates to methods, wherein the expression pattern of the tissue is
independent of the amount of connective tissue in the sample. 

Biopsies contain epithelial cells that most often are the targets for the studies, and in addition
many other cells that contaminate the epithelial cell fraction to a varying extent. The
contaminants include histiocytes, endothelial cells, leukocytes, nerve cells, muscle cells etc.
Micro dissection is the method of choice for DNA examination, but in the case of expression
studies this procedure is difficult due to RNA degradation during the procedure. The epithelium
may be gently removed and the expression in the remaining submucosa and underlying
connective tissue (the bladder wall) monitored. Genes expressed at high or low levels in the
bladder wall should be interrogated when performing expression monitoring of the mucosa and
tumors. A similar approach could be used for studies of epithelia in other organs.

In one embodiment of the invention normal mucosa lining the bladder lumen from bladders for
cancer is scraped off. Then biopsies are taken from the denuded submucosa and connective
tissue, reaching approximately 5 mm into the bladder wall, and immediately disintegrated in
guanidinium isothiocyanate. Total RNA may be extracted, pooled, and poly(A)+ mRNA may be
prepared from the pool followed by conversion to double-stranded cDNA and in vitro
transcription into cRNA containing biotin-labeled CTP and UTP.

Genes that are expressed and genes that are not expressed in bladder wall can both interfere 
with the interpretation of the expression in a biopsy, and should be considered when 
interpreting expression intensities in tumor biopsies, as the bladder wall component of a biopsy 
varies in amount from biopsy to biopsy. 

When having determined the pattern of genes expressed in bladder wall components said 
pattern may be subtracted from a pattern obtained from the sample resulting in a third pattern 
related to the mucosa (epithelial) cells. 

In another embodiment of the invention a method is provided for determining an expression
pattern of a bladder tissue sample independent of the proportion of submucosal, muscle and
connective tissue cells present. A single-cell suspension of disaggregated bladder tumor
cells is isolated from a bladder tissue sample comprising bladder cells, submucosal cells,
muscle cells, and connective tissue cells. A pattern of expression is thus formed for the
sample which is independent of the proportion of submucosal, muscle, and connective tissue
cells in the bladder tissue sample.

Yet another method relates to the elimination of mRNA from bladder wall components before 
determining the pattern, e.g. by filtration and/or affinity chromatography to remove mRNA
related to the bladder wall. 

Working with tumor material requires biopsies or body fluids suspected to comprise relevant
cells. Working with RNA requires freshly frozen or immediately processed biopsies, or
chemical pretreatment of the biopsy. Apart from the cancer tissue, biopsies do inevitably
contain many different cell types, such as cells present in the blood, connective and muscle
tissue, endothelium etc. In the case of DNA studies, microdissection or laser capture are
methods of choice; however, the time-dependent degradation of RNA makes it difficult to
perform manipulation of the tissue for more than a few minutes. Furthermore, studies of
expressed sequences may be difficult on the few cells obtained via microdissection or laser
capture, as these cells may have an expression pattern that deviates from the predominant
pattern in a tumor due to large intratumoral heterogeneity.

In the present context high density expression arrays may be used to evaluate the impact of
bladder wall components in bladder tumor biopsies, and to test preparation of single cell
solutions as a means of eliminating the contaminants. The results of these evaluations
permit the design of methods of evaluating bladder samples without the interfering
background noise caused by ubiquitous contaminating submucosal, muscle, and connective
tissue cells. The evaluating assays of the invention may be of any type.

While high density expression arrays can be used, other techniques are also contemplated. 
These include other techniques for assaying for specific mRNA species, including RT-PCR 
and Northern Blotting, as well as techniques for assaying for particular protein products, 
such as ELISA, Western blotting, and enzyme assays. Gene expression patterns according
to the present invention are determined by measuring any gene product of a particular gene, 
including mRNA and protein. A pattern may be for one or more genes. 

RNA or protein can be isolated and assayed from a test sample using any techniques known 
in the art. They can for example be isolated from a fresh or frozen biopsy, from formalin-fixed
tissue, from body fluids, such as blood, plasma, serum, urine, or sputum. 

Expression of genes may in general be detected by either detecting mRNA from the cells 
and/or detecting expression products, such as peptides and proteins. 

The detection of mRNA of the invention may be a tool for determining the developmental 
stage of a cell type, which may be definable by its pattern of expression of messenger RNA. 
For example, in particular stages of cells, high levels of ribosomal RNA are found whereas 
relatively low levels of other types of messenger RNAs may be found. Where a pattern is 
shown to be characteristic of a stage, said stage may be defined by that particular pattern of 
messenger RNA expression. The mRNA population is a good determinant of a 
developmental stage, and may be correlated with other structural features of the cell. In this 
manner, cells at specific developmental stages will be characterized by the intracellular 
environment, as well as the extracellular environment. The present invention also allows the 
combination of definitions based in part upon antigens and in part upon mRNA expression. 
In one embodiment, the two may be combined in a single incubation step. A particular 
incubation condition may be found which is compatible with both hybridization recognition 
and non-hybridization recognition molecules. Thus, e.g., an incubation condition may be 
selected which allows both specificity of antibody binding and specificity of nucleic acid 
hybridization. This allows simultaneous performance of both types of interactions on a single 
matrix. Again, where developmental mRNA patterns are correlated with structural features, 
or with probes which are able to hybridize to intracellular mRNA populations, a cell sorter 
may be used to sort specifically those cells having desired mRNA population patterns. 

It is within the general scope of the present invention to provide methods for the detection of 
mRNA. Such methods often involve sample extraction, PCR amplification, nucleic acid 
fragmentation and labeling, extension reactions, and transcription reactions. 

The nucleic acid (either genomic DNA or mRNA) may be isolated from the sample according 
to any of a number of methods well known to those of skill in the art. One of skill will 
appreciate that where alterations in the copy number of a gene are to be detected, genomic 
DNA is preferably isolated. Conversely, where expression levels of a gene or genes are to 
be detected, preferably RNA (mRNA) is isolated. 

Methods of isolating total mRNA are well known to those of skill in the art. In one 
embodiment, the total nucleic acid is isolated from a given sample using, for example, an 
acid guanidinium-phenol-chloroform extraction method, and polyA.sup.+ mRNA is isolated 
by oligo dT column chromatography or by using (dT)n magnetic beads (see, e.g., Sambrook 
et al., Molecular Cloning: A Laboratory Manual (2nd ed.), Vols. 1-3, Cold Spring Harbor 
Laboratory, (1989), or Current Protocols in Molecular Biology, F. Ausubel et al., ed., Greene 
Publishing and Wiley-Interscience, New York (1987)). 

The sample may be from tissue and/or body fluids, as defined elsewhere herein. Before 
analyzing the sample, e.g., on an oligonucleotide array, it will often be desirable to perform 
one or more sample preparation operations upon the sample. Typically, these sample 
preparation operations will include such manipulations as extraction of intracellular material, 
e.g., nucleic acids from whole cell samples or viruses, amplification of nucleic acids, 
fragmentation, transcription, labeling and/or extension reactions. One or more of these 
various operations may be readily incorporated into the device of the present invention. 

DNA extraction may be relevant under circumstances where possible mutations in the genes 
are to be determined in addition to the determination of expression of the genes. 

For those embodiments where whole cells or other tissue samples are being analyzed, it will 
typically be necessary to extract the nucleic acids from the cells or viruses prior to 
continuing with the various sample preparation operations. Accordingly, following sample 
collection, nucleic acids may be liberated from the collected cells, viral coat etc. into a crude 
extract, followed by additional treatments to prepare the sample for subsequent operations, 
such as denaturation of contaminating (DNA binding) proteins, purification, filtration and 
desalting. 



Liberation of nucleic acids from the sample cells, and denaturation of DNA binding proteins 
may generally be performed by physical or chemical methods. For example, chemical 
methods generally employ lysing agents to disrupt the cells and extract the nucleic acids 
from the cells, followed by treatment of the extract with chaotropic salts such as guanidinium 
isothiocyanate or urea to denature any contaminating and potentially interfering proteins. 

Alternatively, physical methods may be used to extract the nucleic acids and denature DNA 
binding proteins, such as physical protrusions within microchannels or sharp edged particles 
that pierce cell membranes and extract their contents. Combinations of such structures with 
piezoelectric elements for agitation can provide suitable shear forces for lysis. 

More traditional methods of cell extraction may also be used, e.g., employing a channel with 
restricted cross-sectional dimension which causes cell lysis when the sample is passed 
through the channel with sufficient flow pressure. Alternatively, cell extraction and denaturing 
of contaminating proteins may be carried out by applying an alternating electrical current to 
the sample. More specifically, the sample of cells is flowed through a microtubular array 
while an alternating electric current is applied across the fluid flow. Subjecting cells to 
ultrasonic agitation, or forcing cells through microgeometry apertures, thereby subjecting the 
cells to high shear stress resulting in rupture, are also possible extraction methods. 

Following extraction, it will often be desirable to separate the nucleic acids from other 
elements of the crude extract, e.g. denatured proteins, cell membrane particles and salts. 
Removal of particulate matter is generally accomplished by filtration or flocculation. Further, 
where chemical denaturing methods are used, it may be desirable to desalt the sample prior 
to proceeding to the next step. Desalting of the sample and isolation of the nucleic acid may 
generally be carried out in a single step, e.g. by binding the nucleic acids to a solid phase 
and washing away the contaminating salts, by performing gel filtration chromatography on 
the sample, or by passing salts through dialysis membranes. Suitable solid supports for 
nucleic acid binding include e.g. diatomaceous earth or silica (i.e., glass wool). Suitable gel 
exclusion media, also well known in the art, may be readily incorporated into the devices of 
the present invention and are commercially available from, e.g., Pharmacia and Sigma 
Chemical. 

Alternatively, desalting methods may generally take advantage of the high electrophoretic 
mobility and negative charge of DNA compared to other elements. Electrophoretic methods may 
also be utilized in the purification of nucleic acids from other cell contaminants and debris. 
Upon application of an appropriate electric field, the nucleic acids present in the sample will 
migrate toward the positive electrode and become trapped on the capture membrane. 
Sample impurities remaining free of the membrane are then washed away by applying an 
appropriate fluid flow. Upon reversal of the voltage, the nucleic acids are released from the 
membrane in a substantially purer form. Further, coarse filters may also be overlaid on the 
barriers to avoid any fouling of the barriers by particulate matter, proteins or nucleic acids, 
thereby permitting repeated use. 

In a similar aspect, the high electrophoretic mobility of nucleic acids, with their negative 
charges, may be utilized to separate nucleic acids from contaminants by utilizing a short 
column of a gel or other appropriate matrix which will slow or retard the flow of 
other contaminants while allowing the faster nucleic acids to pass. 

This invention provides nucleic acid affinity matrices that bear a large number of different 
nucleic acid affinity ligands allowing the simultaneous selection and removal of a large 
number of preselected nucleic acids from the sample. Methods of producing such affinity 
matrices are also provided. In general the methods involve the steps of a) providing a nucleic 
acid amplification template array comprising a surface to which are attached at least 50 
oligonucleotides having different nucleic acid sequences, and wherein each different 
oligonucleotide is localized in a predetermined region of said surface, the density of said 
oligonucleotides is greater than about 60 different oligonucleotides per 1 cm.sup.2, and all of 
said different oligonucleotides have an identical terminal 3' nucleic acid sequence and an 
identical terminal 5' nucleic acid sequence; b) amplifying said multiplicity of oligonucleotides 
to provide a pool of amplified nucleic acids; and c) attaching the pool of nucleic acids to a 
solid support. 
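
By way of illustration only, the conditions recited in step a) can be expressed as a simple 
check. The following Python sketch is not part of the described method; the function names, 
the flanking sequences and the example areas are hypothetical and chosen purely for the 
example. 

```python
from itertools import product


def build_templates(variable_regions, five_prime, three_prime):
    """Flank each variable region with the shared 5' and 3' terminal sequences."""
    return [five_prime + v + three_prime for v in variable_regions]


def meets_template_criteria(templates, area_cm2, five_prime, three_prime):
    """Check the three conditions of step a) on a set of template oligonucleotides."""
    distinct = set(templates)
    enough_species = len(distinct) >= 50            # at least 50 different sequences
    dense_enough = len(distinct) / area_cm2 > 60    # more than about 60 per cm^2
    shared_ends = all(
        t.startswith(five_prime) and t.endswith(three_prime) for t in distinct
    )                                               # identical terminal sequences
    return enough_species and dense_enough and shared_ends


# Hypothetical example: all 64 trinucleotides serve as variable regions.
FIVE_PRIME = "GGATCC"    # illustrative flank, not taken from the text
THREE_PRIME = "GAATTC"   # illustrative flank, not taken from the text
variable = ["".join(p) for p in product("ACGT", repeat=3)]
templates = build_templates(variable, FIVE_PRIME, THREE_PRIME)
```

Because every template carries the same terminal sequences, a single primer pair can in 
principle amplify the whole pool in step b). 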

For example, nucleic acid affinity chromatography is based on the tendency of 
complementary, single-stranded nucleic acids to form a double-stranded or duplex structure 
through complementary base pairing. A nucleic acid (either DNA or RNA) can easily be 
attached to a solid substrate (matrix) where it acts as an immobilized ligand that interacts 
with and forms duplexes with complementary nucleic acids present in a solution contacted to 
the immobilized ligand. Unbound components can be washed away from the bound complex 
to either provide a solution lacking the target molecules bound to the affinity column, or to 
provide the isolated target molecules themselves. The nucleic acids captured in a hybrid 
duplex can be separated and released from the affinity matrix by denaturation either through 
heat, adjustment of salt concentration, or the use of a destabilizing agent such as 
formamide, TWEEN.TM.-20 denaturing agent, or sodium dodecyl sulfate (SDS). 
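
The principle of separation by complementary base pairing can be sketched in a few lines of 
Python. This is a simplified illustration under stated assumptions: sequences are DNA written 
5' to 3', only perfect complementarity counts as binding, and the function names are 
hypothetical. Real hybridization also depends on temperature, salt and mismatches, which the 
sketch ignores. 

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Sequence of the strand that base-pairs with seq, read 5' to 3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


def forms_duplex(ligand, target):
    """True if the target contains a stretch perfectly complementary to the ligand."""
    return reverse_complement(ligand) in target.upper()


def affinity_separate(ligands, sample):
    """Split a sample into nucleic acids retained on the matrix and the flow-through."""
    bound, unbound = [], []
    for target in sample:
        if any(forms_duplex(ligand, target) for ligand in ligands):
            bound.append(target)
        else:
            unbound.append(target)
    return bound, unbound
```

Depending on the purpose, either list may be the product of interest: the unbound list 
corresponds to a sample depleted of the targets, the bound list to the isolated targets. 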

Affinity columns (matrices) are typically used to isolate a single nucleic acid, typically 
by providing a single species of affinity ligand. Alternatively, affinity columns bearing a single 
affinity ligand (e.g. oligo dT columns) have been used to isolate a multiplicity of nucleic acids 
where the nucleic acids all share a common sequence (e.g. a polyA). 

The type of affinity matrix used depends on the purpose of the analysis. For example, where 
it is desired to analyze mRNA expression levels of particular genes in a complex nucleic acid 
sample (e.g., total mRNA), it is often desirable to eliminate nucleic acids produced by genes 
that are constitutively overexpressed and thereby tend to mask gene products expressed at 
characteristically lower levels. Thus, in one embodiment, the affinity matrix can be used to 
remove a number of preselected gene products (e.g., actin, GAPDH, etc.). This is 
accomplished by providing an affinity matrix bearing nucleic acid affinity ligands 
complementary to the gene products (e.g., mRNAs or nucleic acids derived therefrom) or to 
subsequences thereof. Hybridization of the nucleic acid sample to the affinity matrix will 
result in duplex formation between the affinity ligands and their target nucleic acids. Upon 
elution of the sample from the affinity matrix, the matrix will retain the duplexed nucleic acids, 
leaving a sample depleted of the overexpressed target nucleic acids. 



The affinity matrix can also be used to identify unknown mRNAs or cDNAs in a sample. 
Where the affinity matrix contains nucleic acids complementary to every known gene (e.g., in 
a cDNA library, DNA reverse transcribed from an mRNA, mRNA used directly or amplified, 
or polymerized from a DNA template) in a sample, capture of the known nucleic acids by the 
affinity matrix leaves a sample enriched for those nucleic acid sequences that are unknown. 
In effect, the affinity matrix is used to perform a subtractive hybridization to isolate unknown 
nucleic acid sequences. The remaining "unknown" sequences can then be purified and 
sequenced according to standard methods. 



The affinity matrix can also be used to capture (isolate) and thereby purify unknown nucleic 
acid sequences. For example, an affinity matrix can be prepared that contains nucleic acids 
(affinity ligands) that are complementary to sequences not previously identified, or not 
previously known to be expressed in a particular nucleic acid sample. The sample is then 
hybridized to the affinity matrix and those sequences that are retained on the affinity matrix 
are "unknown" nucleic acids. The retained nucleic acids can be eluted from the matrix (e.g. 
at increased temperature, increased destabilizing agent concentration, or decreased salt) 
and the nucleic acids can then be sequenced according to standard methods. 

Similarly, the affinity matrix can be used to efficiently capture (isolate) a number of known 
nucleic acid sequences. Again, the matrix is prepared bearing nucleic acids complementary 
to those nucleic acids it is desired to isolate. The sample is contacted to the matrix under 
conditions where the complementary nucleic acid sequences hybridize to the affinity ligands 
in the matrix. The non-hybridized material is washed off the matrix leaving the desired 
sequences bound. The hybrid duplexes are then denatured providing a pool of the isolated 
nucleic acids. The different nucleic acids in the pool can be subsequently separated 
according to standard methods (e.g. gel electrophoresis). 

As indicated above the affinity matrices can be used to selectively remove nucleic acids from 
virtually any sample containing nucleic acids (e.g. in a cDNA library, DNA reverse 
transcribed from an mRNA, mRNA used directly or amplified, or polymerized from a DNA 
template, and so forth). The nucleic acids adhering to the column can be removed by 
washing with a low salt concentration buffer, a buffer containing a destabilizing agent such 
as formamide, or by elevating the column temperature. 

In one particularly preferred embodiment, the affinity matrix can be used in a method to 
enrich a sample for unknown RNA sequences (e.g. expressed sequence tags (ESTs)). The 
method involves first providing an affinity matrix bearing a library of oligonucleotide probes 
specific to known RNA (e.g., EST) sequences. Then, RNA from undifferentiated and/or 
unactivated cells and RNA from differentiated, activated, or pathological (e.g., transformed) 
cells, or cells otherwise having a different metabolic state, are separately hybridized against 
the affinity matrices to provide two pools of RNAs lacking the known RNA sequences. 

In a preferred embodiment, the affinity matrix is packed into a columnar casing. The sample 
is then applied to the affinity matrix (e.g. injected onto a column or applied to a column by a 
pump such as a sampling pump driven by an autosampler). The affinity matrix (e.g. affinity 
column) bearing the sample is subjected to conditions under which the nucleic acid probes 
comprising the affinity matrix hybridize specifically with complementary target nucleic acids. 
Such conditions are accomplished by maintaining appropriate pH, salt and temperature 
conditions to facilitate hybridization as discussed above. 

For a number of applications, it may be desirable to extract and separate messenger RNA 
from cells, cellular debris, and other contaminants. As such, the device of the present 
invention may, in some cases, include an mRNA purification chamber or channel. In general, 
such purification takes advantage of the poly-A tails on mRNA. In particular and as noted 
above, poly-T oligonucleotides may be immobilized within a chamber or channel of the 
device to serve as affinity ligands for mRNA. Poly-T oligonucleotides may be immobilized 
upon a solid support incorporated within the chamber or channel, or alternatively, may be 
immobilized upon the surface(s) of the chamber or channel itself. Immobilization of 
oligonucleotides on the surface of the chambers or channels may be carried out by methods 
described herein including, e.g., oxidation and silanation of the surface followed by standard 
DMT synthesis of the oligonucleotides. 

In operation, the lysed sample is introduced to a high salt solution to increase the ionic 
strength for hybridization, whereupon the mRNA will hybridize to the immobilized poly-T. The 
mRNA bound to the immobilized poly-T oligonucleotides is then washed free in a low ionic 
strength buffer. The poly-T oligonucleotides may be immobilized upon porous surfaces, e.g., 
porous silicon, zeolites, silica xerogels, sintered particles, or other solid supports. 
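
The selection logic of poly-T capture can be illustrated in silico. The sketch below is a 
simplification under stated assumptions: a sequence is treated as captured if it ends in a run 
of A residues, and the minimum tail length of 10 is an arbitrary illustrative value, not a 
figure taken from this description. The function names are hypothetical. 

```python
import re


def has_poly_a_tail(transcript, min_tail=10):
    """True if the sequence ends in at least min_tail consecutive A residues."""
    return re.search(r"A{%d,}$" % min_tail, transcript.upper()) is not None


def capture_on_poly_t(sample, min_tail=10):
    """Split a lysate into poly-T bound (mRNA-like) and unbound sequences."""
    bound = [s for s in sample if has_poly_a_tail(s, min_tail)]
    unbound = [s for s in sample if not has_poly_a_tail(s, min_tail)]
    return bound, unbound
```

In this picture the bound list corresponds to the mRNA retained at high ionic strength and 
subsequently released in the low ionic strength buffer. 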

Following sample preparation, the sample can be subjected to one or more different analysis 
operations. A variety of analysis operations may generally be performed, including size 
based analysis using, e.g., microcapillary electrophoresis, and/or sequence based analysis 
using, e.g., hybridization to an oligonucleotide array. 

In the latter case, the nucleic acid sample may be probed using an array of oligonucleotide 
probes. Oligonucleotide arrays generally include a substrate having a large number of 
positionally distinct oligonucleotide probes attached to the substrate. These arrays may be 
produced using mechanical or light directed synthesis methods which incorporate a 
combination of photolithographic methods and solid phase oligonucleotide synthesis 
methods. 

The basic strategy for light directed synthesis of oligonucleotide arrays is as follows. The 
surface of a solid support, modified with photosensitive protecting groups, is illuminated 
through a photolithographic mask, yielding reactive hydroxyl groups in the illuminated 
regions. A selected nucleotide, typically in the form of a 3'-O-phosphoramidite-activated 
deoxynucleoside (protected at the 5' hydroxyl with a photosensitive protecting group), is then 
presented to the surface and coupling occurs at the sites that were exposed to light. 
Following capping and oxidation, the substrate is rinsed and the surface is illuminated 
through a second mask to expose additional hydroxyl groups for coupling. A second selected 
nucleotide (e.g., 5'-protected, 3'-O-phosphoramidite-activated deoxynucleoside) is presented 
to the surface. The selective deprotection and coupling cycles are repeated until the desired 
set of products is obtained. Since photolithography is used, the process can be readily 
miniaturized to generate high density arrays of oligonucleotide probes. Furthermore, the 
sequence of the oligonucleotides at each site is known. See Pease et al. Mechanical 
synthesis methods are similar to the light directed methods except involving mechanical 
direction of fluids for deprotection and addition in the synthesis steps. 
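
The deprotection and coupling cycles of the light directed method can be modelled as 
follows. This is a schematic sketch only: each mask is represented as a set of site indices 
that are illuminated, chemistry such as capping and oxidation is not modelled, and the 
function name and data layout are hypothetical. Each string records the order in which 
nucleotides were added at a site. 

```python
def synthesize_array(n_sites, cycles):
    """Simulate mask-directed synthesis.

    cycles is a list of (mask, nucleotide) pairs, where mask is the set of
    site indices exposed to light in that cycle. Only exposed sites are
    deprotected and therefore couple the nucleotide presented next.
    """
    probes = [""] * n_sites
    for mask, nucleotide in cycles:
        for site in mask:
            if not 0 <= site < n_sites:
                raise ValueError("mask refers to a site outside the array")
            probes[site] += nucleotide
    return probes


# Four cycles are enough to build four different dinucleotides on four sites.
example_cycles = [
    ({0, 1}, "A"),
    ({2, 3}, "C"),
    ({0, 2}, "G"),
    ({1, 3}, "T"),
]
example_probes = synthesize_array(4, example_cycles)
```

The example shows why the sequence at each site is known: it follows directly from the 
sequence of masks and nucleotides used. 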

For some embodiments, oligonucleotide arrays may be prepared having all possible probes 
of a given length. The hybridization pattern of the target sequence on the array may be used 
to reconstruct the target DNA sequence. Hybridization analysis of large numbers of probes 
can be used to sequence long stretches of DNA or provide an oligonucleotide array which is 
specific and complementary to a particular nucleic acid sequence. For example, in 
particularly preferred aspects, the oligonucleotide array will contain oligonucleotide probes 
which are complementary to specific target sequences, and individual or multiple mutations 
of these. Such arrays are particularly useful in the diagnosis of specific disorders which are 
characterized by the presence of a particular nucleic acid sequence. 
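
Reconstruction of a target from its hybridization pattern on an array of all probes of a given 
length may be illustrated with a toy example. The sketch below assumes error-free 
hybridization, perfect complementarity, and a target in which every overlap of k-1 bases 
occurs only once; it is one simple way to do the reconstruction, not necessarily the 
procedure used in practice, and all function names are hypothetical. 

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def all_probes(k):
    """Every possible probe of length k (4**k sequences)."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def hybridization_pattern(target, k):
    """Probes that find a perfectly complementary stretch in the target."""
    present = {target[i:i + k] for i in range(len(target) - k + 1)}
    return {p for p in all_probes(k) if reverse_complement(p) in present}


def reconstruct(pattern):
    """Rebuild the target from the positive probes by chaining overlaps."""
    kmers = {reverse_complement(p) for p in pattern}
    if not kmers:
        raise ValueError("empty hybridization pattern")
    k = len(next(iter(kmers)))
    if k < 2:
        raise ValueError("probes must be at least 2 bases long")
    suffixes = {m[1:] for m in kmers}
    # The first subsequence is the only one whose prefix is nobody's suffix.
    starts = [m for m in kmers if m[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("no unique starting subsequence")
    seq = starts[0]
    remaining = kmers - {seq}
    while remaining:
        nxt = [m for m in remaining if m[:-1] == seq[-(k - 1):]]
        if len(nxt) != 1:
            raise ValueError("ambiguous or broken overlap")
        seq += nxt[0][-1]
        remaining.remove(nxt[0])
    return seq
```

For instance, the target ATGCGTAC gives six positive probes out of the 64 possible 
trinucleotide probes, and chaining their overlaps recovers the original sequence. 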

Following sample collection and nucleic acid extraction, the nucleic acid portion of the 
sample is typically subjected to one or more preparative reactions. These preparative 
reactions include in vitro transcription, labeling, fragmentation, amplification and other 
reactions. Nucleic acid amplification increases the number of copies of the target nucleic 
acid sequence of interest. A variety of amplification methods are suitable for use in the 
methods and device of the present invention, including for example, the polymerase chain 
reaction (PCR), the ligase chain reaction (LCR), self sustained sequence 
replication (3SR), and nucleic acid based sequence amplification (NASBA). 

The latter two amplification methods involve isothermal reactions based on isothermal 
transcription, which produce both single stranded RNA (ssRNA) and double stranded DNA 
(dsDNA) as the amplification products in a ratio of approximately 30 or 100 to 1, respectively. 
As a result, where these latter methods are employed, sequence analysis may be carried out 
using either type of substrate, i.e. complementary to either DNA or RNA. 

Frequently, it is desirable to amplify the nucleic acid sample prior to hybridization. One of 
skill in the art will appreciate that whatever amplification method is used, if a quantitative 
result is desired, care must be taken to use a method that maintains or controls for the 
relative frequencies of the amplified nucleic acids. 
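
One way to control for relative frequencies is to co-amplify a known quantity of a control 
sequence and use it as an internal standard. The following sketch shows the arithmetic 
under a simplifying assumption that is not stated in this description, namely that target and 
control amplify with equal efficiency and that the measured signal is proportional to the 
amount of product. The function name is hypothetical. 

```python
def estimate_target_quantity(target_signal, control_signal, control_quantity):
    """Scale the target signal by the known quantity of the internal standard.

    Assumes equal amplification efficiency for target and control, so the
    ratio of the signals equals the ratio of the starting quantities.
    """
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return control_quantity * target_signal / control_signal
```

With a control of 10 units giving a signal of 1500 and a target signal of 3000, the estimate 
for the target is 20 units. 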

PCR 

Methods of "quantitative" amplification are well known to those of skill in the art. For 
example, quantitative PCR involves simultaneously co-amplifying a known quantity of a 
control sequence using the same primers. This provides an internal standard that may be 
used to calibrate the PCR reaction. The high density array may then include probes specific 
to the internal standard for quantification of the amplified nucleic acid. 

Thus, in one embodiment, this invention provides for a method of optimizing a probe set for 
detection of a particular gene. Generally, this method involves providing a high density array 
containing a multiplicity of probes of one or more particular length(s) that are complementary 
to subsequences of the mRNA transcribed by the target gene. In one embodiment the high 
density array may contain every probe of a particular length that is complementary to a 
particular mRNA. The probes of the high density array are then hybridized with their target 
nucleic acid alone and then hybridized with a high complexity, high concentration nucleic 
acid sample that does not contain the targets complementary to the probes. Thus, for 
example, where the target nucleic acid is an RNA, the probes are first hybridized with their 
target nucleic acid alone and then hybridized with RNA made from a cDNA library (e.g., 
reverse transcribed polyA.sup.+ mRNA) where the sense of the hybridized RNA is opposite 
that of the target nucleic acid (to ensure that the high complexity sample does not contain 
targets for the probes). Those probes that show a strong hybridization signal with their target 
and little or no cross-hybridization with the high complexity sample are preferred probes for 
use in the high density arrays of this invention. 
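
The selection step of the probe optimization can be sketched as a simple filter over the two 
hybridization experiments. The thresholds and the function name below are hypothetical; the 
description does not specify numerical cut-offs for a "strong" signal or for "little or no" 
cross-hybridization. 

```python
def select_preferred_probes(target_signal, complex_signal, min_signal, max_cross):
    """Return probes with a strong target signal and little cross-hybridization.

    target_signal:  probe id -> signal when hybridized with the target alone
    complex_signal: probe id -> signal with the high complexity sample
    min_signal, max_cross: illustrative cut-offs chosen by the user
    """
    return sorted(
        probe
        for probe, signal in target_signal.items()
        if signal >= min_signal and complex_signal.get(probe, 0.0) <= max_cross
    )
```

A probe missing from the second experiment is treated here as showing no 
cross-hybridization, which is a design choice of the sketch rather than a rule from the text. 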

PCR amplification generally involves the use of one strand of the target nucleic acid 
sequence as a template for producing a large number of complements to that sequence. 
Generally, two primer sequences complementary to different ends of a segment of the 
complementary strands of the target sequence hybridize with their respective strands of the 
target sequence, and in the presence of polymerase enzymes and nucleoside triphosphates, 
the primers are extended along the target sequence. The extensions are melted from the 
target sequence and the process is repeated, this time with the additional copies of the 
target sequence synthesized in the preceding steps. PCR amplification typically involves 
repeated cycles of denaturation, hybridization and extension reactions to produce sufficient 
amounts of the target nucleic acid. The first step of each cycle of the PCR involves the 
separation of the nucleic acid duplex formed by the primer extension. Once the strands are 
separated, the next step in PCR involves hybridizing the separated strands with primers that 
flank the target sequence. The primers are then extended to form complementary copies of 
the target strands. For successful PCR amplification, the primers are designed so that the 
position at which each primer hybridizes along a duplex sequence is such that an extension 
product synthesized from one primer, when separated from the template (complement), 
serves as a template for the extension of the other primer. The cycle of denaturation, 
hybridization, and extension is repeated as many times as necessary to obtain the desired 
amount of amplified nucleic acid. 
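
The number of cycles needed to reach a desired amount can be estimated with an idealized 
model in which each cycle multiplies the number of copies by a constant factor. This model, 
the efficiency parameter and the function names are assumptions made for illustration; real 
reactions plateau in late cycles and the model does not capture that. 

```python
import math


def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Copies after a number of cycles; efficiency 1.0 means perfect doubling."""
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(initial_copies, desired_copies, efficiency=1.0):
    """Smallest whole number of cycles that reaches the desired copy number."""
    if desired_copies <= initial_copies:
        return 0
    ratio = desired_copies / initial_copies
    return math.ceil(math.log(ratio) / math.log(1.0 + efficiency))
```

Under perfect doubling, a single template yields about 10.sup.9 copies after 30 cycles, and 
100 starting copies need 14 cycles to exceed one million copies. 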

In PCR methods, strand separation is normally achieved by heating the reaction to a 
sufficiently high temperature for a sufficient time to cause the denaturation of the duplex but 
not to cause an irreversible denaturation of the polymerase. Typical heat denaturation 
involves temperatures ranging from about 80.degree. C. to 105.degree. C. for times ranging 
from seconds to minutes. Strand separation, however, can be accomplished by any suitable 
denaturing method including physical, chemical, or enzymatic means. Strand separation may 
be induced by a helicase, for example, or an enzyme capable of exhibiting helicase activity. 

In addition to PCR and IVT reactions, the methods and devices of the present invention are 
also applicable to a number of other reaction types, e.g., reverse transcription, nick 
translation, and the like. 

The nucleic acids in a sample will generally be labeled to facilitate detection in subsequent 
steps. Labeling may be carried out during the amplification, in vitro transcription or nick 
translation processes. In particular, amplification, in vitro transcription or nick translation may 
incorporate a label into the amplified or transcribed sequence, either through the use of 
labeled primers or the incorporation of labeled dNTPs into the amplified sequence. 

Hybridization between the sample nucleic acid and the oligonucleotide probes upon the 
array is then detected, using, e.g., epifluorescence confocal microscopy. Typically, the sample 
is mixed during hybridization to enhance hybridization of nucleic acids in the sample to nucleic 
acid probes on the array. 

In some cases, hybridized oligonucleotides may be labeled following hybridization. For 
example, where biotin labeled dNTPs are used in, e.g. amplification or transcription, 
streptavidin linked reporter groups may be used to label hybridized complexes. Such 
operations are readily integratable into the systems of the present invention. Alternatively, 
the nucleic acids in the sample may be labeled following amplification. Post amplification 
labeling typically involves the covalent attachment of a particular detectable group upon the 
amplified sequences. Suitable labels or detectable groups include a variety of fluorescent or 
radioactive labeling groups well known in the art. These labels may also be coupled to the 
sequences using methods that are well known in the art. 

Methods for detection depend upon the label selected. A fluorescent label is preferred 
because of its extreme sensitivity and simplicity. Standard labeling procedures are used to 
determine the positions where interactions between a sequence and a reagent take place. 
For example, if a target sequence is labeled and exposed to a matrix of different probes, only 
those locations where probes do interact with the target will exhibit any signal. Alternatively, 
other methods may be used to scan the matrix to determine where interaction takes place. 
Of course, the spectrum of interactions may be determined in a temporal manner by 
repeated scans of interactions which occur at each of a multiplicity of conditions. However, 
instead of testing each individual interaction separately, a multiplicity of sequence 
interactions may be simultaneously determined on a matrix. 
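
Determining where interactions take place on a scanned matrix amounts to finding the 
positions whose signal exceeds the background. The sketch below assumes the scan is 
available as a grid of intensity values and that a single background threshold is adequate; 
both the data layout and the threshold are illustrative assumptions, and the function name is 
hypothetical. 

```python
def hybridized_positions(intensity, threshold):
    """Return (row, column) positions whose signal exceeds the threshold.

    intensity is a list of rows, each row a list of measured signal values
    for the probe sites of the matrix.
    """
    return [
        (r, c)
        for r, row in enumerate(intensity)
        for c, value in enumerate(row)
        if value > threshold
    ]
```

Because the probe at each position is known, the list of positions translates directly into 
the set of probes that interacted with the labeled target. 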



Means of detecting labeled target (sample) nucleic acids hybridized to the probes of the high 
density array are known to those of skill in the art. Thus, for example, where a colorimetric 
label is used, simple visualization of the label is sufficient. Where a radioactive labeled probe 
is used, detection of the radiation (e.g., with photographic film or a solid state detector) is 
sufficient. 



In a preferred embodiment, however, the target nucleic acids are labeled with a fluorescent 
label and the localization of the label on the probe array is accomplished with fluorescence 
microscopy. The hybridized array is excited with a light source at the excitation wavelength 
of the particular fluorescent label and the resulting fluorescence at the emission wavelength 
is detected. In a particularly preferred embodiment, the excitation light source is a laser 
appropriate for the excitation of the fluorescent label. 



The target polynucleotide may be labeled by any of a number of convenient detectable 
markers. A fluorescent label is preferred because it provides a very strong signal with low 
background. It is also optically detectable at high resolution and sensitivity through a quick 
scanning procedure. Other potential labeling moieties include radioisotopes, 
chemiluminescent compounds, labeled binding proteins, heavy metal atoms, spectroscopic 
markers, magnetic labels, and linked enzymes. 

Another method for labeling may bypass any label of the target sequence. The target may be 
exposed to the probes, and a double strand hybrid is formed at those positions only. Addition 
of a double strand specific reagent will detect where hybridization takes place. An 
intercalative dye such as ethidium bromide may be used as long as the probes themselves 
do not fold back on themselves to a significant extent forming hairpin loops. However, the 
length of the hairpin loops in short oligonucleotide probes would typically be insufficient to 
form a stable duplex. 

Suitable chromogens will include molecules and compounds which absorb light in a 
distinctive range of wavelengths so that a color may be observed, or emit light when 
irradiated with radiation of a particular wave length or wave length range, e.g., fluorescers. 
Biliproteins, e.g., phycoerythrin, may also serve as labels. 

A wide variety of suitable dyes are available, being primarily chosen to provide an intense 
color with minimal absorption by their surroundings. Illustrative dye types include quinoline 
dyes, triarylmethane dyes, acridine dyes, alizarine dyes, phthaleins, insect dyes, azo dyes, 
anthraquinoid dyes, cyanine dyes, phenazathionium dyes, and phenazoxonium dyes. 

A wide variety of fluorescers may be employed either by themselves or in conjunction with 
quencher molecules. Fluorescers of interest fall into a variety of categories having certain 
primary functionalities. These primary functionalities include 1- and 2-aminonaphthalene, 
p,p'-diaminostilbenes, pyrenes, quaternary phenanthridine salts, 9-aminoacridines, 
p,p'-diaminobenzophenone imines, anthracenes, oxacarbocyanine, merocyanine, 
3-aminoequilenin, perylene, bis-benzoxazole, bis-p-oxazolyl benzene, 1,2-benzophenazin, 
retinol, bis-3-aminopyridinium salts, hellebrigenin, tetracycline, sterophenol, 
benzimidazolylphenylamine, 2-oxo-3-chromen, indole, xanthen, 7-hydroxycoumarin, 
phenoxazine, salicylate, strophanthidin, porphyrins, triarylmethanes and flavin. Individual 
fluorescent compounds which have functionalities for linking or which can be modified to 
incorporate such functionalities include, e.g., dansyl chloride; fluoresceins such as 
3,6-dihydroxy-9-phenylxanthhydrol; rhodamineisothiocyanate; 
N-phenyl 1-amino-8-sulfonatonaphthalene; N-phenyl 2-amino-6-sulfonatonaphthalene; 
4-acetamido-4-isothiocyanato-stilbene-2,2'-disulfonic acid; pyrene-3-sulfonic acid; 
2-toluidinonaphthalene-6-sulfonate; N-phenyl, N-methyl 2-aminonaphthalene-6-sulfonate; 
ethidium bromide; stebrine; auromine-0; 2-(9'-anthroyl)palmitate; dansyl 
phosphatidylethanolamine; N,N'-dioctadecyl oxacarbocyanine; N,N'-dihexyl oxacarbocyanine; 
merocyanine, 4-(3'pyrenyl)butyrate; d-3-aminodesoxy-equilenin; 12-(9'-anthroyl)stearate; 
2-methylanthracene; 9-vinylanthracene; 2,2'-(vinylene-p-phenylene)bisbenzoxazole; 
p-bis[2-(4-methyl-5-phenyl-oxazolyl)]benzene; 6-dimethylamino-1,2-benzophenazin; retinol; 
bis(3'-aminopyridinium) 1,10-decandiyl diiodide; sulfonaphthylhydrazone of hellibrienin; 
chlorotetracycline; N-(7-dimethylamino-4-methyl-2-oxo-3-chromenyl)maleimide; 
N-[p-(2-benzimidazolyl)-phenyl]maleimide; N-(4-fluoranthyl)maleimide; bis(homovanillic acid); 
resazarin; 4-chloro-7-nitro-2,1,3-benzooxadiazole; merocyanine 540; resorufin; rose bengal; 
and 2,4-diphenyl-3(2H)-furanone. 



Desirably, fluorescers should absorb light above about 300 nm, preferably above about 350 nm,
and more preferably above about 400 nm, usually emitting at wavelengths more than about
10 nm higher than the wavelength of the light absorbed. It should be noted that the
absorption and emission characteristics of the bound dye may differ from those of the
unbound dye. Therefore, the wavelength ranges and characteristics given for the dyes refer
to the dyes as employed and not to the unconjugated dye characterized in an arbitrary
solvent.
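
The wavelength preferences above can be written down as a simple screen. The following is a minimal sketch in Python, not part of the described method: the class and function names are hypothetical, the example wavelengths are illustrative placeholders rather than measured values, and the thresholds (300, 350, 400 nm and a 10 nm emission shift) are taken from the paragraph above and treated as exact cut-offs only for the purpose of illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fluorescer:
    """Spectral maxima of a dye as employed (bound), not in an arbitrary solvent."""
    name: str
    absorption_nm: float
    emission_nm: float


def absorption_tier(dye: Fluorescer) -> Optional[str]:
    """Grade the absorption maximum against the thresholds named in the text."""
    if dye.absorption_nm > 400:
        return "more preferred"
    if dye.absorption_nm > 350:
        return "preferred"
    if dye.absorption_nm > 300:
        return "acceptable"
    return None  # absorbs at or below about 300 nm


def emission_shift_ok(dye: Fluorescer, min_shift_nm: float = 10.0) -> bool:
    """True if emission lies more than min_shift_nm above the absorption maximum."""
    return dye.emission_nm - dye.absorption_nm > min_shift_nm


# Hypothetical dyes with placeholder wavelengths.
candidates = [
    Fluorescer("dye A", absorption_nm=495.0, emission_nm=519.0),
    Fluorescer("dye B", absorption_nm=340.0, emission_nm=345.0),
]
for dye in candidates:
    print(dye.name, absorption_tier(dye), emission_shift_ok(dye))
```

In practice the values entered would be those measured for the conjugated dye under assay conditions, since the text notes that bound and unbound dyes may differ.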

Fluorescers are generally preferred because by irradiating a fluorescer with light, one can
obtain a plurality of emissions. Thus, a single label can provide for a plurality of measurable
events.

Detectable signal may also be provided by chemiluminescent and bioluminescent sources.
Chemiluminescent sources include a compound which becomes electronically excited by a
chemical reaction and may then emit light which serves as the detectable signal or donates
energy to a fluorescent acceptor. A diverse number of families of compounds have been
found to provide chemiluminescence under a variety of conditions. One family of compounds
is 2,3-dihydro-1,4-phthalazinedione. The most popular compound is luminol, which is the
5-amino compound. Other members of the family include the 5-amino-6,7,8-trimethoxy- and
the dimethylamino[ca]benz analog. These compounds can be made to luminesce with
alkaline hydrogen peroxide or calcium hypochlorite and base. Another family of compounds
is the 2,4,5-triphenylimidazoles, with lophine as the common name for the parent product.
Chemiluminescent analogs include para-dimethylamino and -methoxy substituents.
Chemiluminescence may also be obtained with oxalates, usually oxalyl active esters, e.g.,
p-nitrophenyl, and a peroxide, e.g., hydrogen peroxide, under basic conditions. Alternatively,
luciferins may be used in conjunction with luciferase or lucigenins to provide
bioluminescence.

Spin labels are provided by reporter molecules with an unpaired electron spin which can be
detected by electron spin resonance (ESR) spectroscopy. Exemplary spin labels include
organic free radicals, in particular nitroxide free radicals, and transition metal complexes,
particularly of vanadium, copper, iron, and manganese, and the like.

In addition, amplified sequences may be subjected to other post amplification treatments.
For example, in some cases, it may be desirable to fragment the sequence prior to
hybridization with an oligonucleotide array, in order to provide segments which are more
readily accessible to the probes and which avoid looping and/or hybridization to multiple
probes. Fragmentation of the nucleic acids may generally be carried out by physical,
chemical or enzymatic methods that are known in the art.

Following the various sample preparation operations, the sample will generally be subjected
to one or more analysis operations. Particularly preferred analysis operations include, e.g.
sequence based analyses using an oligonucleotide array and/or size based analyses using,
e.g. microcapillary array electrophoresis.

In some embodiments it may be desirable to provide an additional or alternative means for
analyzing the nucleic acids from the sample.

Microcapillary array electrophoresis generally involves the use of a thin capillary or channel
which may or may not be filled with a particular separation medium. Electrophoresis of a
sample through the capillary provides a size based separation profile for the sample.

Microcapillary array electrophoresis generally provides a rapid method for size based
sequencing, PCR product analysis and restriction fragment sizing. The high surface to
volume ratio of these capillaries allows for the application of higher electric fields across the
capillary without substantial thermal variation across the capillary, consequently allowing for
more rapid separations. Furthermore, when combined with confocal imaging methods these
methods provide sensitivity in the range of attomoles, which is comparable to the sensitivity
of radioactive sequencing methods.

In many capillary electrophoresis methods, the capillaries, e.g. fused silica capillaries or
channels etched, machined or molded into planar substrates, are filled with an appropriate
separation/sieving matrix. A variety of sieving matrices known in the art may be used in the
microcapillary arrays. Examples of such matrices include, e.g. hydroxyethyl cellulose,
polyacrylamide and agarose. Gel matrices may be introduced and polymerized within the
capillary channel. However, in some cases this may result in entrapment of bubbles within
the channels, which can interfere with sample separations. Accordingly, it is often desirable
to place a preformed separation matrix within the capillary channel(s) prior to mating the
planar elements of the capillary portion. Fixing the two parts, e.g. through sonic welding,
permanently fixes the matrix within the channel. Polymerization outside of the channels
helps to ensure that no bubbles are formed. Further, the pressure of the welding process
helps to ensure a void-free system.

In addition to their use in nucleic acid "fingerprinting" and other size-based analyses, the
capillary arrays may also be used in sequencing applications. In particular, gel based
sequencing techniques may be readily adapted for capillary array electrophoresis.

In addition to detection of mRNA, or as the sole detection method, expression products from
the genes discussed above may be detected as indications of the biological condition of the
tissue. Expression products may be detected either in the tissue sample as such, or in a
body fluid sample, such as blood, serum, plasma, faeces, mucus, sputum, cerebrospinal
fluid, and/or urine of the individual.

The expression products, peptides and proteins, may be detected by any suitable technique 
known to the person skilled in the art. 

In a preferred embodiment the expression products are detected by means of specific
antibodies directed to the various expression products, such as immunofluorescent and/or
immunohistochemical staining of the tissue.



Immunohistochemical localization of expressed proteins may be carried out by
immunostaining of tissue sections from the individual tumors to determine which cells
expressed the protein encoded by the transcript in question. The transcript levels may be
used to select a group of proteins expected to show variation from sample to sample, which
makes possible a rough correlation between the level of protein detected and the intensity
of the transcript on the microarray.
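
One way such a rough correlation could be quantified is with a rank correlation, which suits ordinal staining scores. The sketch below is an illustration only: the source does not name a statistic, Spearman's coefficient is an assumption made here, and the transcript intensities and staining scores are invented placeholder values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Placeholder data: one value per tumor sample for a single gene.
transcript_intensity = [120, 850, 430, 60, 990]   # microarray signal
staining_score = [0, 2, 1, 0, 3]                  # e.g. 0 (none) to 3 (strong)
print(round(spearman(transcript_intensity, staining_score), 3))
```

A coefficient near 1 would indicate that samples with higher transcript intensity also tend to stain more strongly; with only a handful of samples it should be read as a rough indication, in keeping with the text.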

For example, sections may be cut from paraffin-embedded tissue blocks, mounted, and
deparaffinized by incubation at 80 °C for 10 min, followed by immersion in heated oil at 60 °C
for 10 min (Estisol 312, Estichem A/S, Denmark) and rehydration. Antigen retrieval is
achieved in TEG (Tris-EDTA-Glycerol) buffer using microwaves at 900 W. The tissue
sections may be cooled in the buffer for 15 min before a brief rinse in tap water. Endogenous
peroxidase activity is blocked by incubating the sections with 1% H2O2 for 20 min, followed
by three rinses in tap water, 1 min each. The sections may then be soaked in PBS buffer for
2 min. The next steps can be modified from the descriptions given by Oncogene Science
Inc., in the Mouse Immunohistochemistry Detection System, XHCOI (UniTect, Uniondale,
NY, USA). Briefly, the tissue sections are incubated overnight at 4 °C with primary antibody
(against beta-2 microglobulin (Dako), cytokeratin 8, cystatin-C (both from Europa, US), junB,
CD59, E-cadherin, apo-E, cathepsin E, vimentin, IGFII (all from Santa Cruz)), followed by
three rinses in PBS buffer for 5 min each. Afterwards, the sections are incubated with
biotinylated secondary antibody for 30 min, rinsed three times with PBS buffer and
subsequently incubated with ABC (avidin-biotinylated horseradish peroxidase complex) for
30 min, followed by three rinses in PBS buffer.

Staining may be performed by incubation with AEC (3-amino-9-ethylcarbazole) for 10 min. The
tissue sections are counterstained with Mayer's hematoxylin, washed in tap water for 5 min
and mounted with glycerol-gelatin. Positive and negative controls may be included in each
staining round with all antibodies.

In yet another embodiment the expression products may be detected by means of
conventional enzyme assays, such as ELISA methods.

Furthermore, the expression products may be detected by means of peptide/protein chips
capable of specifically binding the peptides and/or proteins assessed. Thereby an
expression pattern may be obtained.

Assay

In a further aspect the invention relates to an assay for predicting the prognosis of a
biological condition in animal tissue, comprising

at least one first marker capable of detecting an expression level of at least one gene
selected from the group of genes consisting of gene No. 1 to gene No. 562.

Preferably the assay further comprises means for correlating the expression level to at least
one standard expression level and/or at least one reference pattern.

The means for correlating preferably includes one or more standard expression levels and/or
reference patterns for use in comparing or correlating the expression levels or patterns
obtained from a tumor under examination to the standards.
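
As an illustration of what a means for correlating might compute, the sketch below assigns a sample to whichever reference pattern it lies closest to. It is a hedged example only: the distance measure (root-mean-square difference of log2 expression levels), the gene identifiers, the reference labels and all numeric values are assumptions introduced here and are not specified in the text.

```python
from math import log2, sqrt


def log_distance(sample, reference):
    """Root-mean-square difference of log2 levels over the shared genes.

    Both arguments map a gene identifier to a positive expression level.
    """
    genes = sorted(set(sample) & set(reference))
    if not genes:
        raise ValueError("sample and reference share no genes")
    squared = sum((log2(sample[g]) - log2(reference[g])) ** 2 for g in genes)
    return sqrt(squared / len(genes))


def closest_reference(sample, references):
    """Return the label of the nearest reference pattern and all distances."""
    distances = {
        label: log_distance(sample, pattern)
        for label, pattern in references.items()
    }
    best = min(distances, key=distances.get)
    return best, distances


# Hypothetical reference patterns and a sample under examination.
references = {
    "reference pattern A": {"gene_1": 100.0, "gene_2": 800.0, "gene_3": 50.0},
    "reference pattern B": {"gene_1": 900.0, "gene_2": 120.0, "gene_3": 400.0},
}
sample = {"gene_1": 110.0, "gene_2": 700.0, "gene_3": 60.0}

label, distances = closest_reference(sample, references)
print(label)
```

Working in log space treats a two-fold increase and a two-fold decrease symmetrically, which is one common reason for the choice; other distance or correlation measures could be substituted without changing the overall scheme.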

Preferably the invention relates to an assay for determining an expression pattern of a
bladder cell, comprising at least a first marker and/or a second marker, wherein the first
marker is capable of detecting a gene from a first gene group as defined above, and/or the
second marker is capable of detecting a gene from a second gene group as defined above,
and correlating the first expression level and/or the second expression level to a standard
level of the assessed genes to predict the prognosis of a biological condition in the animal
tissue.

The marker(s) preferably specifically detect a gene as identified herein.

As described above, it is preferred to determine the expression level from more than one
gene, and correspondingly, it is preferred to include more than one marker in the assay,
such as at least two markers, such as at least three markers, such as at least four markers,
such as at least five markers, such as at least six markers, such as at least seven markers,
such as at least eight markers, such as at least nine markers, such as at least ten markers,
such as at least 15 markers.

When using markers for at least two different groups, it is preferred that the above number of
markers relates to the markers in each group.

As discussed above the marker may be any nucleotide probe, such as a DNA, RNA, PNA, or
LNA probe capable of hybridising to mRNA indicative of the expression level. The
hybridisation conditions are preferably as described below for probes. In another embodiment
the marker is an antibody capable of specifically binding the expression product in question.

Patterns can be compared manually by a person or by a computer or other machine. An
algorithm can be used to detect similarities and differences. The algorithm may score and
compare, for example, the genes which are expressed and the genes which are not
expressed. Alternatively, the algorithm may look for changes in intensity of expression of a
particular gene and score changes in intensity between two samples. Similarities may be
determined on the basis of genes which are expressed in both samples and genes which are
not expressed in both samples, or on the basis of genes whose intensity of expression is
numerically similar.
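
The two scoring approaches described above can be sketched as follows. This is a minimal illustration under stated assumptions: the expression threshold, the two-fold change cut-off, the gene identifiers and the intensity values are all hypothetical and would be chosen for the platform actually used.

```python
def presence_concordance(sample_a, sample_b, threshold):
    """Fraction of shared genes called the same way in both samples.

    A gene counts as expressed when its intensity is >= threshold, so a gene
    agrees when it is expressed in both samples or in neither.
    """
    genes = sorted(set(sample_a) & set(sample_b))
    if not genes:
        raise ValueError("samples share no genes")
    agreeing = sum(
        (sample_a[g] >= threshold) == (sample_b[g] >= threshold) for g in genes
    )
    return agreeing / len(genes)


def intensity_changes(sample_a, sample_b, min_fold=2.0):
    """Map each gene whose intensity changed at least min_fold to b/a."""
    changes = {}
    for gene in sorted(set(sample_a) & set(sample_b)):
        a, b = sample_a[gene], sample_b[gene]
        if a <= 0 or b <= 0:
            continue  # a fold change is undefined for non-positive signals
        fold = b / a
        if fold >= min_fold or fold <= 1 / min_fold:
            changes[gene] = fold
    return changes


# Placeholder intensities for two samples.
sample_a = {"gene_1": 500, "gene_2": 20, "gene_3": 300, "gene_4": 10}
sample_b = {"gene_1": 450, "gene_2": 400, "gene_3": 100, "gene_4": 15}

print(presence_concordance(sample_a, sample_b, threshold=50))
print(intensity_changes(sample_a, sample_b))
```

The first function corresponds to scoring on expressed versus not-expressed genes, the second to scoring changes in intensity between two samples.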

Generally, the detection operation will be performed using a reader device external to the 
diagnostic device. However, it may be desirable in some cases to incorporate the data 
gathering operation into the diagnostic device itself. 

The detection apparatus may be a fluorescence detector, or a spectroscopic detector, or 
another detector. 

Although hybridization is one type of specific interaction which is clearly useful in this
mapping embodiment, antibody reagents may also be very useful.

Gathering data from the various analysis operations, e.g. oligonucleotide and/or
microcapillary arrays, will typically be carried out using methods known in the art. For
example, the arrays may be scanned using lasers to excite fluorescently labeled targets that
have hybridized to regions of the probe arrays mentioned above, which can then be imaged
using charge-coupled devices ("CCDs") for wide-field scanning of the array. Alternatively,
another particularly useful method for gathering data from the arrays is laser confocal
microscopy, which combines the ease and speed of a readily automated process with high
resolution detection.

Following the data gathering operation, the data will typically be reported to a data analysis
operation. To facilitate the sample analysis operation, the data obtained by the reader from
the device will typically be analyzed using a digital computer. Typically, the computer will be
appropriately programmed for receipt and storage of the data from the device, as well as for
analysis and reporting of the data gathered, e.g. interpreting fluorescence data to determine
the sequence of hybridizing probes, normalization of background and single base mismatch
hybridizations, ordering of sequence data in SBH applications, and the like.

The invention also relates to a pharmaceutical composition for treating a biological condition,
such as bladder tumors.

In one embodiment the pharmaceutical composition comprises one or more of the peptides
being expression products as defined above. In a preferred embodiment, the peptides are
bound to carriers. The peptides may suitably be coupled to a polymer carrier, for example a
protein carrier, such as BSA. Such formulations are well-known to the person skilled in the
art.

The peptides may be suppressor peptides normally lost or decreased in tumor tissue, which
are administered in order to stabilise tumors towards a less malignant stage. In another
embodiment the peptides are onco-peptides capable of eliciting an immune response towards
the tumor cells.

In another embodiment the pharmaceutical composition comprises genetic material, either
genetic material for substitution therapy, or for suppressing therapy as discussed below.

In a third embodiment the pharmaceutical composition comprises at least one antibody
produced as described above.

In the present context the term pharmaceutical composition is used synonymously with the
term medicament. The medicament of the invention comprises an effective amount of one or
more of the compounds as defined above, or a composition as defined above in combination
with pharmaceutically acceptable additives. Such medicament may suitably be formulated
for oral, percutaneous, intramuscular, intravenous, intracranial, intrathecal,
intracerebroventricular, intranasal or pulmonary administration. For most indications a
localised or substantially localised application is preferred.

Strategies in formulation development of medicaments and compositions based on the
compounds of the present invention generally correspond to formulation strategies for any
other protein-based drug product. Potential problems and the guidance required to overcome
these problems are dealt with in several textbooks, e.g. "Therapeutic Peptides and Protein
Formulation. Processing and Delivery Systems", Ed. A.K. Banga, Technomic Publishing AG,
Basel, 1995.

Injectables are usually prepared either as liquid solutions or suspensions, or as solid forms
suitable for solution in, or suspension in, liquid prior to injection. The preparation may also be
emulsified. The active ingredient is often mixed with excipients which are pharmaceutically
acceptable and compatible with the active ingredient. Suitable excipients are, for example,
water, saline, dextrose, glycerol, ethanol or the like, and combinations thereof. In addition, if
desired, the preparation may contain minor amounts of auxiliary substances such as wetting
or emulsifying agents, pH buffering agents, or agents which enhance the effectiveness or
transportation of the preparation.

Formulations of the compounds of the invention can be prepared by techniques known to the
person skilled in the art. The formulations may contain pharmaceutically acceptable carriers
and excipients including microspheres, liposomes, microcapsules and nanoparticles.

The preparation may suitably be administered by injection, optionally at the site where the
active ingredient is to exert its effect. Additional formulations which are suitable for other
modes of administration include suppositories, and in some cases, oral formulations. For
suppositories, traditional binders and carriers include polyalkylene glycols or triglycerides.
Such suppositories may be formed from mixtures containing the active ingredient(s) in the
range of from 0.5% to 10%, preferably 1-2%. Oral formulations include such normally
employed excipients as, for example, pharmaceutical grades of mannitol, lactose, starch,
magnesium stearate, sodium saccharine, cellulose, magnesium carbonate, and the like. These
compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained
release formulations or powders and generally contain 10-95% of the active ingredient(s),
preferably 25-70%.

The preparations are administered in a manner compatible with the dosage formulation, and
in such amount as will be therapeutically effective. The quantity to be administered depends
on the subject to be treated, including, e.g. the weight and age of the subject, the disease to
be treated and the stage of disease. Suitable dosage ranges are of the order of several
hundred µg active ingredient per administration with a preferred range of from about 0.1 µg to
1000 µg, such as in the range of from about 1 µg to 300 µg, and especially in the range of
from about 10 µg to 50 µg. Administration may be performed once or may be followed by
subsequent administrations. The dosage will also depend on the route of administration and
will vary with the age and weight of the subject to be treated. A preferred dose would be in
the interval 30 mg to 70 mg per 70 kg body weight.

Some of the compounds of the present invention are sufficiently active, but for some of the
others, the effect will be enhanced if the preparation further comprises pharmaceutically
acceptable additives and/or carriers. Such additives and carriers will be known in the art. In
some cases, it will be advantageous to include a compound which promotes delivery of the
active substance to its target.

In many instances, it will be necessary to administer the formulation multiple times.
Administration may be a continuous infusion, such as intraventricular infusion, or
administration in several doses, such as several times a day, daily, several times a week,
weekly, etc.

Vaccines

In a further embodiment the present invention relates to a vaccine for the prophylaxis or 
treatment of a biological condition comprising at least one expression product from at least 
one gene said gene being expressed as defined above. 

The term vaccines is used with its normal meaning, i.e. preparations of immunogenic material
for administration to induce in the recipient an immunity to infection or intoxication by a given
infecting agent. Vaccines may be administered by intravenous injection or through oral,
nasal and/or mucosal administration. Vaccines may be either simple vaccines prepared from
one species of expression products, such as proteins or peptides, or a variety of expression
products, or they may be mixed vaccines containing two or more simple vaccines. They are
prepared in such a manner as not to destroy the immunogenic material, although the
methods of preparation vary, depending on the vaccine.

The enhanced immune response achieved according to the invention can be attributable to
e.g. an enhanced increase in the level of immunoglobulins or in the level of T-cells, including
cytotoxic T-cells, and will result in immunisation of at least 50% of individuals exposed to said
immunogenic composition or vaccine, such as at least 55%, for example at least 60%, such
as at least 65%, for example at least 70%, for example at least 75%, such as at least 80%,
for example at least 85%, such as at least 90%, for example at least 92%, such as at least
94%, for example at least 96%, such as at least 97%, for example at least 98%, such as at
least 98.5%, for example at least 99%, for example at least 99.5% of the individuals exposed
to said immunogenic composition or vaccine.

Compositions according to the invention may also comprise any carrier and/or adjuvant
known in the art including functional equivalents thereof. Functionally equivalent carriers are
capable of presenting the same immunogenic determinant in essentially the same steric
conformation when used under similar conditions. Functionally equivalent adjuvants are
capable of providing similar increases in the efficacy of the composition when used under
similar conditions.

Therapy 

The invention further relates to a method of treating individuals suffering from the biological 
condition in question, in particular for treating a bladder tumor. 

Accordingly, the invention relates to a method for reducing cell tumorigenicity or malignancy
of a cell, said method comprising contacting a tumor cell with at least one peptide expressed
by at least one gene selected from the group of genes consisting of gene No. 200-214, 233,
234, 235, 236, 244, 249, 251, 252, 255, 256, 259, 261, 262, 266, 268, 269, 273, 274, 275,
276, 277, 279, 280, 281, 282, 285, 286, 289, 293, 295, 296, 299, 301, 304, 306, 307, 308,
311, 312, 313, 314, 320, 322, 323, 325, 326, 327, 328, 330, 331, 332, 333, 334, 338, 341,
342, 343, 345, 348, 349, 350, 351, 352, 353, 355, 357, 360, 361, 363, 366, 367, 370, 373,
374, 375, 376, 385, 386, 387, 389, 390, 392, 394, 398, 400, 401, 405, 406, 407, 408, 410,
411, 412, 414, 415, 416, 418, 424, 426, 428, 433, 434, 435, 436, 438, 439, 440, 441, 442,
443, 445, 446, 453, 460, 461, 463, 464, 465, 466, 467, 469, 470, 471, 472, 473, 475, 476,
477, 479, 480, 481, 482, 483, 485, 486, 487, 488, 490, 492, 494, 496, 497, 498, 499, 503,
515, 516, 517, 521, 526, 527, 528, 530, 532, 533, 537, 539, 540, 541, 542, 543, 545, 554,
557, 560.

In order to increase the effect, several different peptides may be used simultaneously, such
as wherein the tumor cell is contacted with at least two different peptides.

In one embodiment the invention relates to a method of substitution therapy, i.e.
administration of genetic material generally expressed in normal cells, but lost or decreased
in biological condition cells (tumor suppressors). Thus, the invention relates to a method for
reducing cell tumorigenicity or malignancy of a cell, said method comprising

obtaining at least one gene selected from the group of genes consisting of gene No. 200-214,
233, 234, 235, 236, 244, 249, 251, 252, 255, 256, 259, 261, 262, 266, 268, 269, 273,
274, 275, 276, 277, 279, 280, 281, 282, 285, 286, 289, 293, 295, 296, 299, 301, 304, 306,
307, 308, 311, 312, 313, 314, 320, 322, 323, 325, 326, 327, 328, 330, 331, 332, 333, 334,
338, 341, 342, 343, 345, 348, 349, 350, 351, 352, 353, 355, 357, 360, 361, 363, 366, 367,
370, 373, 374, 375, 376, 385, 386, 387, 389, 390, 392, 394, 398, 400, 401, 405, 406, 407,
408, 410, 411, 412, 414, 415, 416, 418, 424, 426, 428, 433, 434, 435, 436, 438, 439, 440,
441, 442, 443, 445, 446, 453, 460, 461, 463, 464, 465, 466, 467, 469, 470, 471, 472, 473,
475, 476, 477, 479, 480, 481, 482, 483, 485, 486, 487, 488, 490, 492, 494, 496, 497, 498,
499, 503, 515, 516, 517, 521, 526, 527, 528, 530, 532, 533, 537, 539, 540, 541, 542, 543,
545, 554, 557, 560,

introducing said at least one gene into the tumor cell in a manner allowing expression of said
gene(s).

In one embodiment at least one gene is introduced into the tumor cell. In another
embodiment at least two genes are introduced into the tumor cell.

In one aspect of the invention small molecules that either inhibit increased gene expression
or its effects, or substitute for decreased gene expression or its effects, are introduced to the
cellular environment or the cells. Application of small molecules to tumor cells may be
performed by e.g. local application, intravenous injection or oral ingestion. Small
molecules have the ability to restore function of reduced gene expression in tumor or cancer
tissue.

In another aspect the invention relates to a therapy whereby genes whose expression
(increased and/or decreased) is generally correlated to disease are inhibited by one or more
of the following methods:

A method for reducing cell tumorigenicity or malignancy of a cell, said method comprising

obtaining at least one nucleotide probe capable of hybridising with at least one gene of a
tumor cell, said at least one gene being selected from the group of genes consisting of gene
Nos. 1-199, 215-232, 237, 238, 239, 240, 241, 242, 243, 245, 246, 247, 248, 250, 253, 254,
257, 258, 260, 263, 264, 265, 267, 270, 271, 272, 278, 283, 284, 287, 288, 290, 291, 292,
294, 297, 298, 300, 302, 303, 305, 309, 310, 315, 316, 317, 318, 319, 321, 324, 329, 335,
336, 337, 339, 340, 344, 346, 347, 354, 356, 358, 359, 362, 364, 365, 368, 369, 371, 372,
377, 378, 379, 380, 381, 382, 383, 384, 388, 391, 393, 395, 396, 397, 399, 402, 403, 404,
409, 413, 417, 419, 420, 421, 422, 423, 425, 427, 429, 430, 431, 432, 437, 444, 447, 448,
449, 450, 451, 452, 454, 455, 456, 457, 458, 459, 462, 468, 474, 478, 484, 489, 491, 493,
495, 500, 501, 502, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 518, 519, 520,
522, 523, 524, 525, 529, 531, 534, 535, 536, 538, 544, 546, 547, 548, 549, 550, 551, 552,
553, 555, 556, 558, 559, 561, 562,

introducing said at least one nucleotide probe into the tumor cell in a manner allowing the
probe to hybridise to the at least one gene, thereby inhibiting expression of said at least one
gene. This method is preferably based on anti-sense technology, whereby the hybridisation
of said probe to the gene leads to a down-regulation of said gene.

In another preferred embodiment, the method for reducing cell tumorigenicity or malignancy
of a cell is based on RNA interference, comprising small interfering RNAs (siRNAs)
specifically directed against at least one gene being selected from the group of genes
consisting of gene Nos. 1-199, 215-232, 237, 238, 239, 240, 241, 242, 243, 245, 246, 247,
248, 250, 253, 254, 257, 258, 260, 263, 264, 265, 267, 270, 271, 272, 278, 283, 284, 287,
288, 290, 291, 292, 294, 297, 298, 300, 302, 303, 305, 309, 310, 315, 316, 317, 318, 319,
321, 324, 329, 335, 336, 337, 339, 340, 344, 346, 347, 354, 356, 358, 359, 362, 364, 365,
368, 369, 371, 372, 377, 378, 379, 380, 381, 382, 383, 384, 388, 391, 393, 395, 396, 397,
399, 402, 403, 404, 409, 413, 417, 419, 420, 421, 422, 423, 425, 427, 429, 430, 431, 432,
437, 444, 447, 448, 449, 450, 451, 452, 454, 455, 456, 457, 458, 459, 462, 468, 474, 478,
484, 489, 491, 493, 495, 500, 501, 502, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513,
514, 518, 519, 520, 522, 523, 524, 525, 529, 531, 534, 535, 536, 538, 544, 546, 547, 548,
549, 550, 551, 552, 553, 555, 556, 558, 559, 561, 562.
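
The gene-number lists in this section mix single numbers with ranges such as 1-199, which makes them awkward to check by eye. As a convenience sketch only, the hypothetical helper below expands such a list into a set of integers so that membership and overlap between lists can be tested. The two strings are short excerpts from the start of the lists given above, not the full lists.

```python
import re


def expand_gene_numbers(text):
    """Expand a list such as '1-199, 215-232, 237' into a set of integers.

    Any non-digit separators are ignored, so stray punctuation between
    numbers does not affect the result.
    """
    numbers = set()
    for start, end in re.findall(r"(\d+)(?:\s*-\s*(\d+))?", text):
        first = int(start)
        last = int(end) if end else first
        if last < first:
            raise ValueError(f"descending range: {first}-{last}")
        numbers.update(range(first, last + 1))
    return numbers


# Excerpts only: the opening entries of the two kinds of list in this section.
substituted_excerpt = expand_gene_numbers("200-214, 233, 234, 235, 236, 244")
inhibited_excerpt = expand_gene_numbers("1-199, 215-232, 237, 238")

print(len(substituted_excerpt), len(inhibited_excerpt))
print(substituted_excerpt.isdisjoint(inhibited_excerpt))
print(205 in substituted_excerpt, 205 in inhibited_excerpt)
```

Applied to the complete lists, the same helper could be used to confirm that no gene number appears in both the substitution list and the inhibition list.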

The down-regulation may of course also be based on a probe capable of hybridising to
regulatory components of the genes in question, such as promoters.

The hybridization may be tested in vitro at conditions corresponding to in vivo conditions.
Typically, hybridization conditions are of low to moderate stringency. These conditions
favour specific interactions between completely complementary sequences, but allow some
non-specific interaction between less than perfectly matched sequences to occur as well.
After hybridization, the nucleic acids can be "washed" under moderate or high conditions of
stringency to dissociate duplexes that are bound together by some non-specific interaction
(the nucleic acids that form these duplexes are thus not completely complementary).

As is known in the art, the optimal conditions for washing are determined empirically, often
by gradually increasing the stringency. The parameters that can be changed to affect
stringency include, primarily, temperature and salt concentration. In general, the lower the
salt concentration and the higher the temperature, the higher the stringency. Washing can be
initiated at a low temperature (for example, room temperature) using a solution containing a
salt concentration that is equivalent to or lower than that of the hybridization solution.
Subsequent washing can be carried out using progressively warmer solutions having the same
salt concentration. As alternatives, the salt concentration can be lowered and the
temperature maintained in the washing step, or the salt concentration can be lowered and the
temperature increased. Additional parameters can also be altered. For example, use of a
destabilizing agent, such as formamide, alters the stringency conditions.

In reactions where nucleic acids are hybridized, the conditions used to achieve a given level
of stringency will vary. There is not one set of conditions, for example, that will allow
duplexes to form between all nucleic acids that are 85% identical to one another;
hybridization also depends on unique features of each nucleic acid. The length of the
sequence, the composition of the sequence (for example, the content of purine-like
nucleotides versus the content of pyrimidine-like nucleotides) and the type of nucleic acid
(for example, DNA or RNA) affect hybridization. An additional consideration is whether one of
the nucleic acids is immobilized (for example on a filter).
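
The way salt, formamide, length and base composition shift the stability of a duplex can be
illustrated with one commonly cited empirical approximation for the melting temperature of
long DNA:DNA duplexes. The formula, its coefficients (which differ slightly between sources),
the function name and the example probe are assumptions made for illustration only and are
not part of this specification. The sodium values assume that 1X SSC corresponds to roughly
0.195 M sodium ions.

```python
import math


def estimate_tm_celsius(sodium_molar, gc_percent, length_bp, formamide_percent=0.0):
    """Approximate melting temperature (deg C) of a long DNA:DNA duplex.

    Rule of thumb assumed here (not taken from the specification):
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/length
    """
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


# Hypothetical 300 bp probe with 50% GC content.
tm_2x_ssc = estimate_tm_celsius(0.39, 50.0, 300)      # about 2X SSC
tm_02x_ssc = estimate_tm_celsius(0.039, 50.0, 300)    # about 0.2X SSC
tm_formamide = estimate_tm_celsius(0.39, 50.0, 300, formamide_percent=50.0)

# Lower salt or added formamide lowers the estimated Tm, so a wash at a fixed
# temperature becomes more stringent.
print(round(tm_2x_ssc, 1), round(tm_02x_ssc, 1), round(tm_formamide, 1))
```

In this sketch a ten-fold reduction in salt lowers the estimate by 16.6 degrees, which is one
way to see why lowering the salt concentration at a fixed wash temperature raises stringency.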






An example of a progression from lower to higher stringency conditions is the following,
where the salt content is given as the relative abundance of SSC (a salt solution containing
sodium chloride and sodium citrate; 2X SSC is 10-fold more concentrated than 0.2X SSC).
Nucleic acids are hybridized at 42°C in 2X SSC/0.1% SDS (sodium dodecylsulfate; a detergent)
and then washed in 0.2X SSC/0.1% SDS at room temperature (for conditions of low
stringency); 0.2X SSC/0.1% SDS at 42°C (for conditions of moderate stringency); and 0.1X
SSC at 68°C (for conditions of high stringency). Washing can be carried out using only one
of the conditions given, or each of the conditions can be used (for example, washing for
10-15 minutes each in the order listed above). Any or all of the washes can be repeated. As
mentioned above, optimal conditions will vary and can be determined empirically.
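
The wash progression above can be written down as data and checked for the stated rule that
stringency rises as salt falls and temperature rises. The following is a minimal sketch; the
class and function names are hypothetical, and the numeric value used for "room temperature"
is an assumption, since the text gives none.

```python
from dataclasses import dataclass
from typing import Optional

ROOM_TEMP_C = 22.0  # assumption: the text does not give a value for room temperature


@dataclass(frozen=True)
class Wash:
    label: str
    ssc_x: float                  # salt as a multiple of 1X SSC
    sds_percent: Optional[float]  # None where the text states no SDS content
    temp_c: float


WASHES = [
    Wash("low stringency", 0.2, 0.1, ROOM_TEMP_C),
    Wash("moderate stringency", 0.2, 0.1, 42.0),
    Wash("high stringency", 0.1, None, 68.0),
]


def stringency_never_decreases(washes):
    """True if every wash has salt <= and temperature >= the wash before it."""
    return all(later.ssc_x <= earlier.ssc_x and later.temp_c >= earlier.temp_c
               for earlier, later in zip(washes, washes[1:]))


ordered_ok = stringency_never_decreases(WASHES)
reversed_ok = stringency_never_decreases(list(reversed(WASHES)))
print(ordered_ok, reversed_ok)
```

The check only compares salt and temperature; other parameters mentioned above, such as
formamide, are left out of this sketch.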

In another aspect a method of reducing tumorigenicity relates to the use of antibodies
against an expression product of a cell from the biological tissue. The antibodies may be
produced by any suitable method, such as a method comprising the steps of

obtaining expression product(s) from at least one gene, said gene being expressed as
defined above,

immunising a mammal with said expression product(s), and

obtaining antibodies against the expression product.

Use 

The methods described above may be used for producing an assay for diagnosing a
biological condition in animal tissue, or for identification of the origin of a piece of tissue.
Further, the methods of the invention may be used for prediction of a disease course and
treatment response.

Furthermore, the invention relates to the use of a peptide as defined above for preparation of
a pharmaceutical composition for the treatment of a biological condition in animal tissue.

Furthermore, the invention relates to the use of a gene as defined above for preparation of a
pharmaceutical composition for the treatment of a biological condition in animal tissue.

Also, the invention relates to the use of a probe as defined above for preparation of a
pharmaceutical composition for the treatment of a biological condition in animal tissue.

The genetic material discussed above may be any of the described genes or functional
parts thereof. The constructs may be introduced as a single DNA molecule encoding all of
the genes, or different DNA molecules having one or more genes. The constructs may be
introduced simultaneously or consecutively, each with the same or different markers.

The gene may be linked to the complex as such or protected by any suitable system
normally used for transfection such as viral vectors or artificial viral envelope, liposomes or
micelles, wherein the system is linked to the complex.

Numerous techniques for introducing DNA into eukaryotic cells are known to the skilled
artisan. Often this is done by means of vectors, and often in the form of nucleic acid
encapsidated by a (frequently virus-like) proteinaceous coat. Gene delivery systems may be
applied to a wide range of clinical as well as experimental applications.

Vectors containing useful elements such as selectable and/or amplifiable markers,
promoter/enhancer elements for expression in mammalian, particularly human, cells, and which
may be used to prepare stocks of construct DNAs and for carrying out transfections are well
known in the art. Many are commercially available.

Various techniques have been developed for modification of target tissue and cells in vivo. A
number of virus vectors, discussed below, are known which allow transfection and random
integration of the virus into the host. See, for example, Dubensky et al. (1984) Proc. Natl.
Acad. Sci. USA 81:7529-7533; Kaneda et al. (1989) Science 243:375-378; Hiebert et al.
(1989) Proc. Natl. Acad. Sci. USA 86:3594-3598; Hatzoglu et al. (1990) J. Biol. Chem.
265:17285-17293; Ferry et al. (1991) Proc. Natl. Acad. Sci. USA 88:8377-8381. Routes and
modes of administering the vector include injection, e.g. intravascularly or intramuscularly,
inhalation, or other parenteral administration.

Advantages of adenovirus vectors for human gene therapy include the fact that recombination
is rare, no human malignancies are known to be associated with such viruses, the
adenovirus genome is double stranded DNA which can be manipulated to accept foreign genes
of up to 7.5 kb in size, and live adenovirus is a safe human vaccine organism.

Another vector which can express the DNA molecule of the present invention, and is useful
in gene therapy, particularly in humans, is vaccinia virus, which can be rendered
non-replicating (U.S. Pat. Nos. 5,225,336; 5,204,243; 5,155,020; 4,769,330).

Based on the concept of viral mimicry, artificial viral envelopes (AVE) are designed based on
the structure and composition of a viral membrane, such as HIV-1 or RSV, and used to
deliver genes into cells in vitro and in vivo. See, for example, U.S. Pat. No. 5,252,348;
Schreier H. et al., J. Mol. Recognit., 1995, 8:59-62; Schreier H. et al., J. Biol. Chem., 1994,
269:9090-9098; Schreier, H., Pharm. Acta Helv. 1994, 68:145-159; Chander, R. et al., Life
Sci., 1992, 50:481-489, which references are hereby incorporated by reference in their
entirety. The envelope is preferably produced in a two-step dialysis procedure where the
"naked" envelope is formed initially, followed by unidirectional insertion of the viral surface
glycoprotein of interest. This process and the physical characteristics of the resulting AVE
are described in detail by Chander et al. (supra). Examples of AVE systems are (a) an AVE
containing the HIV-1 surface glycoprotein gp160 (Chander et al., supra; Schreier et al., 1995,
supra) or glycosyl phosphatidylinositol (GPI)-linked gp120 (Schreier et al., 1994, supra),
respectively, and (b) an AVE containing the respiratory syncytial virus (RSV) attachment (G)
and fusion (F) glycoproteins (Stecenko, A. A. et al., Pharm. Pharmacol. Lett. 1:127-129
(1992)). Thus, vesicles are constructed which mimic the natural membranes of enveloped
viruses in their ability to bind to and deliver materials to cells bearing corresponding
surface receptors.

AVEs are used to deliver genes both by intravenous injection and by instillation in the lungs.
For example, AVEs are manufactured to mimic RSV, exhibiting the RSV F surface
glycoprotein which provides selective entry into epithelial cells. F-AVE are loaded with a
plasmid coding for the gene of interest (or a reporter gene such as CAT not present in
mammalian tissue).

The AVE system described herein is physically and chemically essentially identical to the
natural virus yet is entirely "artificial", as it is constructed from phospholipids, cholesterol, and
recombinant viral surface glycoproteins. Hence, there is no carry-over of viral genetic
information and no danger of inadvertent viral infection. Construction of the AVEs in two
independent steps allows for bulk production of the plain lipid envelopes which, in a separate
second step, can then be marked with the desired viral glycoprotein, also allowing for the
preparation of protein cocktail formulations if desired.

Another delivery vehicle for use in the present invention is based on the recent description
of attenuated Shigella as a DNA delivery system (Sizemore, D. R. et al., Science 270:299-
302 (1995), which reference is incorporated by reference in its entirety). This approach
exploits the ability of Shigellae to enter epithelial cells and escape the phagocytic vacuole
as a method for delivering the gene construct into the cytoplasm of the target cell. Invasion
with as few as one to five bacteria can result in expression of the foreign plasmid DNA
delivered by these bacteria.

A preferred type of mediator of nonviral transfection in vitro and in vivo is cationic
(ammonium derivatized) lipids. These positively charged lipids form complexes with negatively
charged DNA, resulting in DNA charge neutralization and compaction. The complexes are
endocytosed upon association with the cell membrane, and the DNA somehow escapes the
endosome, gaining access to the cytoplasm. Cationic lipid:DNA complexes appear highly
stable under normal conditions. Studies of the cationic lipid DOTAP suggest the complex
dissociates when the inner layer of the cell membrane is destabilized and anionic lipids from
the inner layer displace DNA from the cationic lipid. Several cationic lipids are available
commercially. Two of these, DMRI and DC-cholesterol, have been used in human clinical
trials. First generation cationic lipids are less efficient than viral vectors. For delivery to lung,
any inflammatory responses accompanying the liposome administration are reduced by
changing the delivery mode to aerosol administration which distributes the dose more
evenly.

Drug screening 

Genes identified as changing in various stages of bladder cancer can be used as markers for
drug screening. Thus by treating bladder cancer cells with test compounds or extracts, and
monitoring the expression of genes identified as changing in the progression of bladder
cancers, one can identify compounds or extracts which change expression of genes to a
pattern which is of an earlier stage or even of normal bladder mucosa.

It is also within the scope of the invention to use small molecules in drug screening.

The following are non-limiting examples illustrating the present invention. 

EXAMPLES 


Example 1 

Identification of a molecular signature defining disease progression in patients with 
superficial bladder carcinoma 

Patient samples

Bladder tumor biopsies were obtained directly from surgery after removal of the necessary
amount of tissue for routine pathology examination. The tumors were frozen at -80°C in a
guanidinium thiocyanate solution for preservation of the RNA. Informed consent was
obtained in all cases, and the protocols were approved by the scientific ethical committee of
Aarhus County. The samples for the no progression group were selected by the following
criteria: a) Ta or T1 tumors with no prior higher stage tumors; b) a minimum follow up period
of 12 months to the most recent routine cystoscopy examination of the bladder with no
occurrence of tumors of higher stage. The samples for the progression group were selected by
two criteria: a) Ta or T1 tumors with no prior higher stage tumors; b) subsequent progression
to a higher stage tumor, see Table 1.

Table 1. Clinical data on all patients involved in the study 
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Training set 
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Test set 



Group | Sample | Hist. | Progressed to: | Time to progression | Follow-up time
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Delineation of non-progressing tumors from progressing tumors 

To delineate non-progressing tumors from progressing tumors we now profiled a total of 29
bladder tumor samples; 13 early stage bladder tumor samples without progression (median
follow-up time 35 months) and 16 early stage bladder tumor samples with progression
(median time to progression 7 months). See Table 1 for description of patient disease courses.
We analyzed gene expression changes between the two groups of tumors by hybridizing the
labeled RNA samples to customized Affymetrix GeneChips with 59,000 probe-sets to cover
virtually the entire transcriptome (~95% coverage). Low expressed and non-varying
probe-sets were eliminated from the data set and the resulting 6,647 probe-sets that showed
variation across the tumor samples were subjected to further analysis. These probe-sets
represent 5,356 unique genes (Unigene clusters).
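
The elimination of low expressed and non-varying probe-sets can be sketched as a simple
filter on the mean and standard deviation of each probe-set across samples. The thresholds,
the toy values and the function name below are hypothetical; the text does not state the
cut-offs used to arrive at the 6,647 probe-sets.

```python
from statistics import mean, stdev


def filter_probe_sets(expression, min_mean, min_sd):
    """Keep probe-sets that are both expressed and variable across samples.

    expression: dict mapping probe-set id -> list of expression values
    min_mean, min_sd: illustrative thresholds (assumptions, not from the text)
    """
    kept = {}
    for probe_id, values in expression.items():
        if mean(values) >= min_mean and stdev(values) >= min_sd:
            kept[probe_id] = values
    return kept


# Toy data: one low expressed, one varying and one expressed but flat probe-set.
toy_expression = {
    "probe_low": [10, 12, 11, 9],
    "probe_varying": [200, 450, 120, 600],
    "probe_flat": [500, 505, 498, 502],
}
kept_probe_sets = filter_probe_sets(toy_expression, min_mean=50, min_sd=75)
print(sorted(kept_probe_sets))
```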

Gene expression similarities between tumor biopsies

We analyzed gene expression similarities between the tumor biopsies using unsupervised
hierarchical cluster analysis (Fig. 1). This showed a notable distinction between the
non-progressing and the progressing tumors when using the 3,197 most varying probe-sets
(s.d. ≥ 75) for clustering (4 errors; χ² test, P = 0.0001). Using other gene-sets based on
different gene variation criteria demonstrated the same distinction between the tumor groups.
Two of the samples that show later progression (825-3 and 112-2) were found in the
non-progression branch of the cluster dendrogram and two of the non-progressing samples
(815-1 and 150-6) were found in the progression branch. This distinct separation of the
samples indicated a considerable biological difference between the two groups of tumors.
Notably, the T1 tumors did not cluster separately from Ta tumors; however, they did form a
sub-cluster in the progressing branch of the dendrogram. Based on this we decided to look
for a general signature of progression disregarding pathologic staging of the tumors.
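
The reported test can be retraced from the numbers in the text. With 13 non-progressing and
16 progressing tumors and two samples of each group on the wrong branch, the 2 x 2 table is
11 and 2 in the non-progression branch and 2 and 14 in the progression branch. The sketch
below assumes a Pearson chi-square test without continuity correction; the text does not say
which variant was used, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: dendrogram branch (non-progression, progression).
# Columns: clinical outcome (non-progressing, progressing).
chi2, p_value = chi_square_2x2(11, 2, 2, 14)
print(round(chi2, 2), round(p_value, 4))
```

Under these assumptions the statistic is about 15.1 and the P value is close to 0.0001, in
line with the value given above.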
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Selection of the 100 most significantly up-regulated genes in each group using t-test 
statistics 

We delineated the non-progressing tumors from the progressing tumors by selecting the 100
most significantly up-regulated genes in each group using t-test statistics (Fig. 2 and Table
2). Among the genes up regulated in the non-progressing group we found the SERPINB5
and FAT tumor suppressor genes and the FGFR3 gene, which has been shown to be
frequently mutated in superficial bladder tumors with low recurrence rates (van Rhijn et al.
2001). Among the genes up regulated in the progressing group we found the PLK (Yuan et
al. 1997), CDC25B (Galaktionov et al. 1991), CDC20 (Weinstein et al. 1994) and MCM7
(Hiraiwa et al. 1997) genes, which are involved in regulating cell cycle and cell proliferation.
Furthermore, in this group we identified the WHSC1, DD96 and GRB7 genes, which have
been predicted/computed (Gene Ontology) to be involved in oncogenic transformation.
Another interesting candidate in this group is the NRG1 gene, which through interaction with
the HER2/HER3 receptors has been found to induce differentiation of lung epithelial cells
(Liu & Kern 2002). The PPARD gene was also identified as up regulated in the tumors that
show later progression. Disruption of this gene was found to decrease tumorigenicity in colon
cancer cells (Park et al. 2001). Furthermore, PPARD regulates VEGF expression in bladder
cancer cell lines (Fauconnet et al. 2002).
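
The selection of the most significantly up-regulated genes in each group can be sketched as
ranking genes by a two-sample t-statistic. The text does not say which form of the t-test
was used; the sketch assumes the Welch form, and the function names and toy data are
hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch t-statistic; positive when group_a has the higher mean."""
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return 0.0 if se == 0 else (mean(group_a) - mean(group_b)) / se


def top_up_regulated(expression, labels, k):
    """Return (genes up in non-progressing, genes up in progressing), k each.

    expression: dict gene -> list of values, one per sample
    labels: list with 0 (non-progressing) or 1 (progressing), one per sample
    """
    scores = {}
    for gene, values in expression.items():
        prog = [v for v, lab in zip(values, labels) if lab == 1]
        non_prog = [v for v, lab in zip(values, labels) if lab == 0]
        scores[gene] = welch_t(prog, non_prog)
    ranked = sorted(scores, key=scores.get)   # most negative t first
    return ranked[:k], ranked[::-1][:k]


toy_labels = [0, 0, 0, 1, 1, 1]
toy_expression = {
    "gene_a": [10, 11, 12, 20, 21, 22],   # higher in progressing samples
    "gene_b": [20, 22, 21, 10, 12, 11],   # higher in non-progressing samples
    "gene_c": [15, 16, 14, 15, 14, 16],   # no difference
}
up_non_prog, up_prog = top_up_regulated(toy_expression, toy_labels, k=1)
print(up_non_prog, up_prog)
```

With real data k would be 100 per group, giving the 200 markers listed in Table 2.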

Table 2. The 200 best markers of progression

 |  |  | 4.37 | 3.54 | AF064801
 | MS.OoOO | ash2 (absent, small, or homeotic, Drosophila, homolog)-like | 4.37 | 3.52 | AW960782
431857 | Hs.271742 | ADP-ribosyltransferase (NAD; poly (ADP-ribose) polymerase)-like 3 | 4.36 | 3.52 | W19144
 | MS-ozyoz | cyclin D1 (PRAD1: parathyroid adenomatosis 1) | 4.35 | 3.51 | AU077231
AO* 700 | ns. i*fzu | fibroblast growth factor receptor 3 (achondroplasia, thanatophoric dwarfism) | 4.34 | 3.50 | AL119671
440197 | Me ^177111 | pallid (mouse) homolog, pallidin | 4.32 | 3.49 | AW340708
 | nS.»3f ZD | x uuo protein | 4.32 | 3.48 | AF168712
445831 | Hs.13351 | LanC (bacterial lantibiotic synthetase component C)-like 1 | 4.31 | 3.46 | NM_006055
 | He OOAAQT | hypothetical protein Mi3U424o | 4.29 | 3.45 | AW410714
448813 | Hs.22142 | cytochrome b5 reductase b5R.2 | 4.28 | 3.44 | AF169802
449268 | Hs.23412 | hypothetical protein FLJ20160 |  | A'X | mvv ooy z i o
429311 | Hs.198998 | conserved helix-loop-helix ubiquitous kinase | 4.28 | 3.42 | AF080157
423599 | Hs.31731 | peroxiredoxin 5 | 4.27 | 3.41 | AI805664
422913 | Hs.121599 | CGI-18 protein | 4.26 | 3.40 | NM_015947
418127 | Hs.83532 | membrane cofactor protein (CD46, trophoblast-lymphocyte cross-reactive antigen) | 4.26 | 3.39 | BE243982
425221 | Hs.155188 | TATA box binding protein (TBP)-associated factor, RNA polymerase II, F, 55kD | 4.25 | 3.38 | AV649864
426682 | Hs.2056 | UDP glycosyltransferase 1 family, polypeptide A9 | 4.23 | 3.37 | AV660038
421101 | Hs.101840 | major histocompatibility complex, class I-like sequence | 4.23 | 3.37 | AF010446
444037 | Hs.380932 | CHMP1.5 protein | 4.22 | 3.35 | AV647686
443407 | Hs.348514 | ESTs, Moderately similar to 2109260A B cell growth factor [H.sapiens] | 4.21 | 3.35 | AA037683
448625 | Hs.178470 | hypothetical protein FLJ22662 | 4.21 | 3.34 | AW970786
450997 | Hs.35254 | hypothetical protein FLB6421 | 4.16 | 3.34 | AW580830
444336 | Hs.10882 | HMG-box containing protein 1 | 4.15 | 3.33 | AF019214
416977 | Hs.406103 | hypothetical protein FKSG44 | 4.14 | 3.32 | AW130242
420613 | Hs.406637 | ESTs, Weakly similar to A47582 B-cell growth factor precursor [H.sapiens] | 4.13 | 3.31 | AI873871
414843 | Hs.77492 | heterogeneous nuclear ribonucleoprotein A0 | 4.1 | 3.30 | BE386038
408288 | Hs.16886 | gb:zl73d06.r1 Stratagene colon (937204) Homo sapiens cDNA clone 5', mRNA sequence | 4.09 | 3.29 | AA053601
422043 | Hs.110953 | retinoic acid induced 1 | 4.09 | 3.29 | AL133649
432864 | Hs.359682 | calpastatin | 4.08 | 3.28 | D16217
410047 | Hs.379753 | zinc finger protein 36 (KOX 18) | 4.06 | 3.28 | AI167810
400773 | - | NM_003105*:Homo sapiens sortilin-related receptor, L(DLR class) A repeats-containing (SORL1), mRNA. | 4.06 | 3.27 | -
423960 | Hs.136309 | SH3-containing protein SH3GLB1 | 4.05 | 3.27 | AA164516
449626 | Hs.112860 | zinc finger protein 258 | 4.04 | 3.27 | AA774247
429953 | Hs.226581 | COX15 (yeast) homolog, cytochrome c oxidase assembly protein | 4.04 | 3.24 | NM_004376
428901 | Hs.146668 | KIAA1253 protein | 4.02 | 3.24 | AI929568
4ZU079 | HS.9489o | PTD011 protein | 3.99 | 3.22 | NM_014051
436576 | Hs.77542 | ESTs | 3.98 | 3.21 | AI458213
412841 | Hs.101395 | hypothetical protein MGC11352 | 3.97 | 3.21 | AI751157
431604 | Hs.264190 | vacuolar protein sorting 35 (yeast homolog) | 3.96 | 3.21 | AF175265
428318 | Hs.356190 | ubiquitin B | 3.96 | 3.19 | BE300110
430677 | Hs.359784 | desmoglein 2 | 3.95 | 3.19 | Z26317
407955 | Hs.9343 | ESTs | 3.94 | 3.18 | BE536739
426177 | Hs.167700 | Homo sapiens cDNA FLJ10174 fis, clone HEMBA1003959 | 3.92 | 3.17 | AA373452
429802 | Hs.5367 | ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens] | 3.92 | 3.17 | H09548
423810 | Hs.132955 | BCL2/adenovirus E1B 19kD-interacting protein 3-like | 3.92 | 3.16 | AL132665
421475 | Hs.104640 | HIV-1 inducer of short transcripts binding protein; lymphoma related factor | 3.91 | 3.15 | AF000561
436472 |  | klAAflQAR nrnfoln | 3.91 | 3.14 | AL045404
434263 | Hs.79187 | ESTs | 3.9 | 3.13 | N34895
400843 |  | NM_003105*:Homo sapiens sortilin-related receptor, L(DLR class) A repeats-containing (SORL1), mRNA. | 3.9 | 3.13 | 
440357 | Hs.20950 | phospholysine phosphohistidine inorganic pyrophosphate phosphatase | 3.89 | 3.12 | AA379353
437223 | Hs.330716 | Homo sapiens cDNA FLJ14368 fis, clone HEMBA1001122 | 3.88 | 3.12 | C15105
426125 | Hs.166994 | FAT tumor suppressor (Drosophila) homolog | G.OO | O.I 1 | ao7Z41
432554 | Hs.278411 | NCK-associated protein 1 | 3.86 | 3.10 | AI479813
422506 | Hs.300741 | sorcin | 3.85 | 3.10 | R20909
413786 | Hs.13500 | ESTs | 3.83 | 3.09 | AW613780
429561 | HS.250b4o | baculoviral IAP repeat-containing 6 | 3.83 | o.Oo | AF265555
404977 |  | insulin-like growth factor 2 (somatomedin A) (IGF2) | 3.83 | 3.08 | 
427722 | Hs.180479 | hypothetical protein FLJ20116 | 3.82 | 3.08 | AK000123
400844 |  | NM_003105*:Homo sapiens sortilin-related receptor, L(DLR class) A repeats-containing (SORL1), mRNA. | 3.82 | 3.08 | 
4zo4by | ns.oboUoy | methylmalonate-semialdehyde dehydrogenase | 3.81 | 3.07 | BE297886
4o957o | Hs.350547 | nuclear receptor co-repressor/HDAC3 complex subunit | 3.81 | 3.06 | AW263124
4ZOOUO | Hs.170171 | glutamate-ammonia ligase (glutamine synthase) | 3.8 | 3.06 | W23184
448524 | Hs.21356 | hypothetical protein DKFZp762K2015 | 3.79 | 3.06 | AB032948
448357 | Hs.108923 | RAB38, member RAS oncogene family | 3.79 | 3.06 | N20169
425097 | Hs.154545 | PDZ domain containing guanine nucleotide exchange factor(GEF)1 | 3.77 | 3.05 | NM_014247
421649 | Hs.106415 | peroxisome proliferative activated receptor, delta | 5.76 | 5.50 | AA721217
427747 | Hs.180655 | serine/threonine kinase 12 | 5.41 | 5.03 | AW411425
*r oyu 1 u | J-lct TA9iA | Homo sapiens cDNA FLJ13713 fis, clone PLACE2000398, moderately similar to LAR PROTEIN PRECURSOR (LEUKOCYTE ANTIGEN RELATED) (EC 3.1.3.48) | 4.57 | 4.80 | AW170332
 | nS.oUroo | ESTs | 4.49 | 4.59 | AW979008
toou I o | ns. 100/ u | ESTs | 4.42 | 4.50 | AI002106
4*^9Q9Q | He 1T9A1A | neuregulin 1 | 4.37 | 4.40 | AW954938
 |  | Target Exon | 4.22 | 4.32 | 
 | ns. iyoy 1*+ | minor histocompatibility antigen HA-1 | 4.2 | 4.26 | AW505086
 |  | MnMi m protein | 4.16 | 4.24 | AW249934
49R71 9 | ns. 1 yu*to^ | waauooo gene product | 4.14 | 4.19 | AW085131
 | He ^«WM9 | ubiquitin carrier protein | 4.11 | 4.10 | BE270447
421595 | Hs.301685 | KIAA0620 protein | 4.1 | 4.07 | AB014520
400044 | MS.179D47 | Homo sapiens cDNA FLJ12195 fis, clone MAMMA1000865 | 4.04 | 4.02 | AA610175
44o0f y | nS.SoYO | hypothetical protein FLJ10948 | 4.01 | 4.00 | AK001810
422959 | Hs.349256 | paired immunoglobulin-like receptor beta | 4.01 | 3.98 | AV647015
AR9fM 9 | MS.^797Bo | kinesin family member 4A | 3.98 | 3.96 | AA307703
4000z£U | nS.il7oD4 | ESTs | 3.97 | 3.91 | AA677934
40OO0Z | nS.399939 | gb:nc39d05.M NCI_CGAP_Pr2 Homo sapiens cDNA clone, mRNA sequence | 3.95 | 3.88 | AA228357
427999 | Hs.181369 | ubiquitin fusion degradation 1-like | 3.94 | 3.86 | AI435128
4Z7oS1 | Hs.284232 | tumor necrosis factor receptor superfamily, member 12 (translocating chain-association membrane protein) | 3.93 | 3.81 | AB018263
413929 | Hs.75617 | collagen, type IV, alpha 2 | o.9o | o.79 | D COO loo 9
420116 | Hs.95231 | FH1/FH2 domain-containing protein | 3.9 | 3.77 | NM_013241
433914 | Hs.112160 | Homo sapiens DNA helicase homolog (PIF1) mRNA, partial cds | 3.88 | 3.75 | AF108138
420732 | Hs.367762 | ESTs | 3.87 | 3.74 | AA789133
452517 |  | gb:RC-BT068-130399-068 BT068 Homo sapiens cDNA, mRNA sequence | 3.84 | 3.70 | AI904891

WO 2004/040014
T/DK2003/000750

437524 | Hs.385719 | ESTs | 3.82 | 3.68 | AI627565
435158 | Hs.65588 | DAZ associated protein 1 | 3.8 | 3.66 | AW663317
448780 | Hs.267749 | Human DNA sequence from clone 366N23 on chromosome 6q27. Contains two genes similar to consecutive parts of the C. elegans UNC-93 (protein 1, C46F11.1) gene, a KIAA0173 and Tubulin-Tyrosine Ligase LIKE gene, a Mitotic Feedback Control Protein MADP2 H | 3.8 | 3.65 | W92071
445084 | Hs.250848 | hypothetical protein FLJ14761 | 3.79 | 3.64 | H38914
423138 |  | gb:EST385571 MAGE resequences, MAGM Homo sapiens cDNA, mRNA sequence | 3.75 | 3.60 | AW973426
419602 | Hs.91521 | hypothetical protein | 3.74 | 3.59 | AW248434
442549 | Hs.8375 | TNF receptor-associated factor 4 | 3.74 | 3.58 | AI751601
450893 | Hs.25625 | hypothetical protein FLJ11323 | 3.73 | 3.55 | AK002185
414223 | Hs.238246 | hypothetical protein FLJ22479 | 3.73 | 3.55 | AA954566
444312 | Hs.351142 | ESTs | 3.72 | 3.53 | R44007
425205 | Hs.155106 | receptor (calcitonin) activity modifying protein 2 | 3.71 | 3.51 | NM_005854
432327 | Hs.274363 | neuroglobin | 3.71 | 3.49 | R36571
451970 | Hs.211046 | ESTs | 3.67 | 3.48 | AI825732
408049 | Hs.345588 | desmoplakin (DPI, DPII) | 3.67 | 3.45 | AW076098
440100 | Hs.158549 | ESTs, Weakly similar to T2D3_HUMAN TRANSCRIPTION INITIATION FACTOR TFIID 135 KDA SUBUNIT [H.sapiens] | 3.66 | 3.45 | BE382685
426468 | Hs.117558 | ESTs | 3.65 | 3.43 | AA379306
402384 |  | NM_007181*:Homo sapiens mitogen-activated protein kinase kinase kinase kinase 1 (MAP4K1), mRNA. | 3.64 | 3.43 | 
458132 | Hs.103267 | hypothetical protein FLJ22548 similar to gene trap PAT 12 | 3.64 | 3.42 | AW247012
447400 | Hs.18457 | hypothetical protein FLJ20315 | 3.64 | 3.42 | 
443893 | Hs.115472 | ESTs, Weakly similar to 2004399A chromosomal protein [H.sapiens] | 3.63 | 3.41 | 
424959 | Hs.153937 | activated p21cdc42Hs kinase | 3.62 | 3.40 | NM_005781
409586 | Hs.55044 | DKFZP586H2123 protein | 3.6 | 3.39 | AL050214
445692 | Hs.182099 | ESTs | 3.6 | 3.37 | AI248322
433052 | Hs.293003 | ESTs, Weakly similar to PC4259 ferritin associated protein [H.sapiens] | 3.6 | 3.36 | AW971983
421782 | Hs.108258 | actin binding protein; macrophin (microfilament and actin filament cross-linker protein) | 3.59 | 3.35 | AB029290
414907 | Hs.77597 | polo (Drosophila)-like kinase | 3.58 | 3.34 | A57UI 69
454639 |  | gb:RC2-ST0158-091099-011-d05 ST0158 Homo sapiens cDNA, mRNA sequence | 3.57 | 3.33 | AW811633
434547 | Hs.106124 | ESTs | 3.56 | 3.32 | R26240
439130 | Hs.375195 | ESTs | 3.55 | 3.32 | AA306090
413564 |  | gb:601146990F1 NIH_MGC_19 Homo sapiens cDNA clone 5', mRNA sequence | 3.54 | 3.31 | BE260120
443471 | Hs.398102 | Homo sapiens clone FLB3442 PRO0872 mRNA, complete cds | 3.53 | 3.31 | AW236939
424415 | Hs.146580 | enolase 2, (gamma, neuronal) | 3.52 | 3.30 | NM_001975
405036 |  | NM_021628*:Homo sapiens arachidonate lipoxygenase 3 (ALOXE3), mRNA. VERSION NM_020229.1 GI | 3.52 | 3.29 | 
422068 | Hs.104520 | Homo sapiens cDNA FLJ13694 fis, clone PLACE2000115 | 3.52 | 3.29 | AI807519
424244 | Hs.143601 | hypothetical protein hCLA-iso | 3.52 | 3.28 | AV647184
451867 | Hs.27192 | hypothetical protein dJ1057B20.2 | 3.51 | 3.26 | W74157
429187 | Hs.163872 | ESTs, Weakly similar to S65657 alpha-1C-adrenergic receptor splice form 2 [H.sapiens] | 3.49 | 3.26 | AA447648
415200 | Hs.78202 | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 4 | 3.48 | 3.25 | AL040328
405667 |  | Target Exon | 3.48 | 3.25 | 
421075 | Hs.101474 | KIAA0807 protein | 3.47 | 3.23 | AB018350
424909 | Hs.153752 | cell division cycle 25B | 3.46 | 3.22 | S78187
451164 | Hs.60659 | ESTs, Weakly similar to T46471 hypothetical protein DKFZp434L0130.1 [H.sapiens] | 3.46 | 3.21 | AA015912
438644 | Hs.129037 | ESTs | 3.46 | 3.20 | AI126162
432258 | Hs.293039 | ESTs | 3.45 | 3.19 | AW973078
411817 | Hs.72241 | mitogen-activated protein kinase kinase 2 | 3.45 | 3.19 | BE302900
414918 | Hs.72222 | hypothetical protein FLJ13459 | 3.45 | 3.18 | AI219207
437256 | Hs.97871 | Homo sapiens, clone IMAGE:3845253, mRNA, partial cds | 3.43 | 3.17 | AL137404
404208 |  | C6001282:gi|4504223|ref|NP_000172.1| glucuronidase, beta [Homo sapiens] gi|114963|sp|P082 | 3.42 | 3.16 | 
421989 | Hs.110457 | Wolf-Hirschhorn syndrome candidate 1 | 3.4 | 3.15 | AJ007042
438942 | Hs.6451 | PRO0659 protein | 3.39 | 3.14 | AW875398
412649 | Hs.74369 | integrin, alpha 7 | 3.38 | 3.14 | NM_002206
414840 | Hs.23823 | hairy/enhancer-of-split related with YRPW motif-like | 3.37 | 3.13 | R27319
434831 | Hs.273397 | KIAA0710 gene product | 3.35 | 3.12 | AA248060
431842 | Hs.271473 | epithelial protein up-regulated in carcinoma, membrane associated protein 17 | 3.34 | 3.11 | NM_005764
402328 |  | Target Exon | 3.34 | 3.10 | 
405371 | - | NM_005569*:Homo sapiens LIM domain kinase 2 (LIMK2), transcript variant 2a, mRNA. | 3.33 | 3.10 | 
441650 | Hs.132545 | ESTs | 3.32 | 3.09 | AI261960
418629 | Hs.86859 | growth factor receptor-bound protein 7 | 3.3 | 3.09 | BE247550
406002 |  | Target Exon | 3.3 | 3.08 | 
420307 | Hs.66219 | ESTs | 3.29 | 3.08 | AW502869
425093 | Hs.154525 | KIAA1076 protein | 3.28 | 3.07 | AB028999
427351 | Hs.123253 | hypothetical protein FLJ22009 | 3.28 | 3.07 | AW402593
417900 | Hs.82906 | CDC20 (cell division cycle 20, S. cerevisiae, homolog) | 3.28 | 3.06 | BE250127
457228 | Hs.195471 | Human cosmid CRI-JC2015 at D10S289 in 10sp13 | 3.27 | 3.05 | U15177
421026 | Hs.101067 | GCN5 (general control of amino-acid synthesis, yeast, homolog)-like 2 | 3.27 | 3.04 | AL047332
430746 | Hs.406256 | ESTs | 3.27 | 3.03 | AW977370
409556 | Hs.54941 | phosphorylase kinase, alpha 2 (liver) | 3.27 | 3.03 | D38616
451225 | Hs.57655 | ESTs | 3.26 | 3.03 | AI433694
404913 | - | NM_024408*:Homo sapiens Notch (Drosophila) homolog 2 (NOTCH2), mRNA. VERSION NM_024410.1 GI | 3.25 | 3.02 | -
 |  | NM_022819*:Homo sapiens phospholipase A2, group IIF (PLA2G2F), mRNA. VERSION NM_UZUZ4o.Z ol | 3.23 | 3.02 | 
404606 |  | Target Exon | 3.23 | 3.01 | 
414732 | Hs.77152 | minichromosome maintenance deficient (S. cerevisiae) 7 | 3.22 | 3.01 | AW410976
425380 | Hs.32148 | AD-015 protein | 3.22 | 3.00 | AA356389
421186 | Hs.270563 | ESTs, Moderately similar to T12512 hypothetical protein DKFZp434G232.1 [H.sapiens] | 3.21 | 2.98 | AI798039
445462 | Hs.288649 | hypothetical protein MGC3077 | 3.2 | 2.97 | AA378776



Permutation analysis of 100 most significantly up-regulated genes in each group

By permuting the sample labels 500 times we estimated the significance of the
differentially expressed genes. The permutation analysis revealed that it was highly
unlikely to find as good markers by chance, as similar good markers were only found
in 5% of the permutated data sets, see Table 2.
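
A label permutation analysis of this kind can be sketched as follows. The criterion for
"as good markers" used here (the best absolute t-statistic in the permuted data being at
least as large as in the real data) is an assumption, as are the function names, the Welch
form of the t-statistic and the toy data; the text does not spell out the criterion applied.

```python
import math
import random
from statistics import mean, variance


def t_stat(a, b):
    """Welch t-statistic, or 0.0 when both groups are constant."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return 0.0 if se == 0 else (mean(a) - mean(b)) / se


def best_abs_t(expression, labels):
    """Largest absolute t-statistic over all genes for a given labelling."""
    best = 0.0
    for values in expression.values():
        prog = [v for v, lab in zip(values, labels) if lab == 1]
        non_prog = [v for v, lab in zip(values, labels) if lab == 0]
        best = max(best, abs(t_stat(prog, non_prog)))
    return best


def permutation_fraction(expression, labels, n_permutations=500, seed=0):
    """Fraction of label permutations giving a marker at least as good as observed."""
    rng = random.Random(seed)
    observed = best_abs_t(expression, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)   # group sizes are preserved by shuffling
        if best_abs_t(expression, shuffled) >= observed:
            hits += 1
    return hits / n_permutations


toy_labels = [0, 0, 0, 1, 1, 1]
toy_expression = {
    "gene_a": [10, 11, 12, 20, 21, 22],
    "gene_c": [15, 16, 14, 15, 14, 16],
}
fraction = permutation_fraction(toy_expression, toy_labels)
print(fraction)
```

A small fraction indicates that markers of the observed quality rarely arise when the sample
labels carry no information.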

Molecular predictor of progression 

A molecular predictor of progression that combines several genes may have higher prediction
accuracy than single marker genes. Therefore, to identify the gene-set that gives the best
prediction results using the lowest number of genes, we built a predictor using the
"leave one out" cross-validation approach, as previously described (Golub et al. 1999).
Selecting the 100 best genes in each cross-validation loop gave the lowest number of
prediction errors (5 errors, 83% correct classification) in our training set consisting of the 29
tumors (see Figure 3). As in our previous study we used a maximum likelihood classification
approach. We selected a gene-expression signature consisting of those 45 genes that were
present in at least 75% of the cross-validation loops, and these represent our optimal gene-set for
progression prediction (Fig. 4a and Table 3).
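
The selection of a stable gene-set across "leave one out" loops can be sketched as follows. The ranking by absolute t statistic and the names `rank_genes` and `stable_gene_set` are assumptions made for illustration; the maximum likelihood classifier itself is not reproduced here. With 29 loops, a 75% threshold corresponds to a gene being selected in at least 22 loops.

```python
from collections import Counter
import statistics


def rank_genes(expression, labels, n_top):
    """Indices of the n_top genes with the largest |t| between the two classes.

    expression is a list of rows, one row of sample values per gene.
    """
    scores = []
    for gene_index, row in enumerate(expression):
        a = [x for x, lab in zip(row, labels) if lab == 1]
        b = [x for x, lab in zip(row, labels) if lab == 0]
        se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
        t = (statistics.fmean(a) - statistics.fmean(b)) / se if se > 0 else 0.0
        scores.append((abs(t), gene_index))
    scores.sort(reverse=True)
    return [gene_index for _, gene_index in scores[:n_top]]


def stable_gene_set(expression, labels, n_top, min_fraction=0.75):
    """Leave each sample out in turn, re-select genes, and keep the genes
    that were selected in at least min_fraction of the loops."""
    n_samples = len(labels)
    counts = Counter()
    for left_out in range(n_samples):
        keep = [i for i in range(n_samples) if i != left_out]
        sub_expression = [[row[i] for i in keep] for row in expression]
        sub_labels = [labels[i] for i in keep]
        counts.update(rank_genes(sub_expression, sub_labels, n_top))
    threshold = min_fraction * n_samples
    return sorted(gene for gene, count in counts.items() if count >= threshold)
```

Re-selecting the genes inside every loop keeps the left-out sample from influencing the gene choice, which is the point of the cross-validation scheme.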


Many of these 45 genes were also found among the 200 best markers of progression; however,
the cross-validation approach also identified other interesting markers of progression
such as BIRC5 (Survivin), an apoptosis inhibitor that is up-regulated in the tumors that show later
progression. BIRC5 has been reported to be expressed in most common cancers (Ambrosini
et al. 1997). To validate the significance of the 45-gene expression signature we used a test
set consisting of 19 early stage bladder tumors (9 tumors with no progression and 10 tumors
with later progression). Total RNA from these samples was amplified, labeled and hybridized
to customized 60mer-oligonucleotide microarray glass slides, and the relative expression
of the 45 classifier genes was measured following appropriate normalization and
background adjustment of the microarray data. The independent tumor samples were classified
as non-progressing or progressing according to the degree of correlation to the average
no progression profile from the training samples (Fig. 3b). When applying no cutoff limits
to the predictions the predictor identified 74% of the samples correctly. However, as done
recently in a breast cancer study (van't Veer et al. 2002), we applied correlation cutoff limits
of 0.1 and -0.1 in order to disregard samples with very low correlation values, and in this
way we obtained 92% correct predictions of samples with correlation values above 0.1 or
below -0.1. Although the test set is limited in size, the performance is notable and could be of
clinical use.
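
The correlation-based call with cutoff limits can be illustrated with a short sketch. It assumes a plain Pearson correlation, and it assumes that a positive correlation to the average no progression profile means "no progression" and a negative one means "progression"; the function names are hypothetical.

```python
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    norm_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return covariance / (norm_x * norm_y)


def classify_by_correlation(sample, no_progression_profile, cutoff=0.1):
    """Return (call, r); samples with |r| <= cutoff are left unclassified."""
    r = pearson(sample, no_progression_profile)
    if r > cutoff:
        return "no progression", r
    if r < -cutoff:
        return "progression", r
    return "unclassified", r
```

Samples falling between the two cutoffs are simply not called, which is why the 92% figure refers only to samples with correlation values above 0.1 or below -0.1.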



Table 3. The 45 optimal genes for disease progression prediction.

Eos | Unigene | Description | T-Test | 5% perm | Gene Name | Exemplar Accession | CV
439010 | Hs.75216 | protein tyrosine phosphatase, receptor type, F | 4.57 | 4.39 | PTPRF | AW170332 | 29
429124 | Hs.196914 | minor histocompatibility antigen HA-1 | 4.20 | 4.09 | HA-1 | AW505086 | 29
421649 | Hs.106415 | peroxisome proliferative activated receptor, delta | 5.76 | 5.64 | PPARD | AA721217 | 29
433914 | Hs.112160 | DNA helicase homolog (PIF1) | 3.88 | 3.61 | PIF1 | AF108138 | 29
429187 | Hs.163872 | ESTs, Weakly similar to hypothetical protein FLJ20489 | 3.49 | 3.17 | | AA447648 | 28
422765 | Hs.1578 | baculoviral IAP repeat-containing 5 (survivin) | 2.68 | 2.56 | BIRC5 | AW409701 | 28
433844 | Hs.179647 | ESTs | 4.04 | 3.80 | | AA610175 | 26
450893 | Hs.25625 | Hypothetical protein FLJ11323 | 3.73 | 3.46 | FLJ11323 | AK002185 | 25
452866 | Hs.268016 | ESTs | 3.10 | 3.02 | | R26969 | 24
424909 | Hs.153752 | cell division cycle 25B | 3.46 | 3.16 | CDC25B | S78187 | 24
452929 | Hs.172816 | neuregulin 1 | 4.37 | 4.23 | NRG1 | AW954938 | 23
420116 | Hs.95231 | formin homology 2 domain containing 1 | 3.90 | 3.63 | FHOD1 | NM_013241 | 22
453963 | Hs.28959 | cDNA FLJ36513 fis, clone TRACH2001523 | 3.44 | 2.88 | | AA040311 | 29
429561 | Hs.250646 | baculoviral IAP repeat-containing 6 (apollon) | 3.83 | 3.03 | BIRC6 | AF265555 | 29
418127 | Hs.83532 | membrane cofactor protein (CD46, trophoblast-lymphocyte cross-reactive antigen) | 4.26 | 3.37 | MCP | BE243982 | 29
422119 | Hs.111862 | KIAA0590 gene product | 2.33 | 1.95 | KIAA0590 | AI277829 | 29
435521 | Hs.6361 | mitogen-activated protein kinase kinase 1 interacting protein 1 | 5.24 | 4.53 | MAP2K1IP1 | W23814 | 29
409632 | Hs.55279 | serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 5 | 4.89 | 4.11 | SERPINB5 | W74001 | 29
452829 | Hs.63368 | ESTs | 4.95 | 4.31 | | AI955579 | 29
416640 | Hs.79404 | DNA segment on chromosome 4 (unique) 234 expressed sequence | 6.03 | 5.51 | D4S234E | BE262478 | 29
425097 | Hs.154545 | PDZ domain containing guanine nucleotide exchange factor (GEF) 1 | 3.77 | 3.18 | PDZ-GEF1 | NM_014247 | 28
445926 | Hs.334826 | splicing factor 3b, subunit 1, 155kDa | 2.40 | 2.03 | SF3B1 | AF054284 | 28
437325 | Hs.5548 | F-box and leucine-rich repeat protein 5 | 2.48 | 2.09 | FBXL5 | AF142481 | 28
448813 | Hs.22142 | cytochrome b5 reductase b5R.2 | 4.28 | 3.41 | LOC51700 | AF169802 | 28
426799 | Hs.303154 | ESTs | 4.86 | 4.04 | - | H14843 | 28
446847 | Hs.82845 | ESTs | 4.65 | 3.79 | | T51454 | 28
428016 | Hs.181461 | ariadne homolog, ubiquitin-conjugating enzyme E2 binding protein, 1 (Drosophila) | 3.77 | 3.15 | ARIH1 | AJ243190 | 27
418321 | Hs.84087 | KIAA0143 protein | 4.62 | 3.76 | KIAA0143 | D63477 | 27
422984 | Hs.351597 | ESTs | 3.50 | 2.93 | | W28614 | 26
408688 | Hs.152925 | KIAA1268 protein | 3.52 | | | |
440357 | Hs.20950 | phospholysine phosphohistidine inorganic pyrophosphate phosphatase | | [illegible] | LHPP | | [illegible]
420269 | Hs.96264 | alpha thalassemia/mental retardation syndrome X-linked (RAD54 (S. cerevisiae) homolog) | 3.39 | 2.85 | ATRX | |
423185 | [illegible] | ornithine decarboxylase antizyme 1 | 4.61 | 3.71 | OAZ1 | BE299590 | 26
443407 | Hs.348514 | clone IMAGE:4052238, mRNA, partial cds | 4.21 | 3.32 | - | AA037683 | 25
457329 | Hs.359682 | calpastatin | 3.59 | 2.99 | CAST | AI634860 | 25
452714 | Hs.30340 | KIAA1165: likely ortholog of mouse Nedd4 WW domain-binding protein 5A | 3.62 | 3.01 | KIAA1165 | AW770994 | 25
444773 | Hs.11923 | hypothetical protein DJ167A19.1 | 3.71 | 3.11 | DJ167A19.1 | BE156256 | 24
418504 | Hs.85335 | ESTs | 4.59 | 3.67 | | BE159718 | 24
444604 | Hs.11441 | Chromosome 1 open reading frame 8 | 4.69 | 4.17 | C1orf8 | AW327695 | 23
410691 | Hs.65450 | reticulon 4 | | | RTN4 | AW239226 | 23
430604 | | succinate-CoA ligase, GDP-forming, beta subunit | 4.61 | 3.72 | SUCLG2 | AV650537 | 23
421311 | Hs.283609 | muscleblind-like protein MBLL39 | 4.65 | 3.82 | MBLL39 | N71848 | 23
439632 | Hs.334437 | hypothetical protein MGC4248 | 4.29 | 3.42 | MGC4248 | AW410714 | 22
417924 | Hs.82932 | cyclin D1 (PRAD1: parathyroid adenomatosis 1) | 4.35 | 3.49 | CCND1 | AU077231 | 22
453395 | Hs.377915 | mannosidase, alpha, class 2A, member 1 | 4.71 | 3.84 | MAN2A1 | D63998 | 22



Permutation analysis of 45 genes 

Again, permutation analysis revealed that for all of the 45 genes similarly good markers were
only found in 5% of the 500 permuted datasets (see Table 3).

Expression profiling of metachronous higher stage tumors

Expression profiling of the metachronous higher stage tumors could provide important
information on the degree of expression similarities between the primary and the secondary
tumors. Tissues from secondary tumors were available from 14 of the patients with disease
progression and these were also hybridized to the customized Affymetrix GeneChips.






WO 2004/040014 



PCT/DK2003/000750




Hierarchical cluster analysis of all tumor samples based on the 3,213 most varying probesets
showed that tumors originating from the same patient clustered tightly together in 9 of the
cases, indicating a high degree of intra-individual similarity in expression profiles (Fig. 5).
Notably, one tightly clustering pair of tumors was a Ta and a T2+ tumor (patient 941). It was
remarkable that Ta and T1 tumors, and T1 or T2+ tumors, from a single individual were more
similar than e.g. Ta tumors from two individuals. There was no correlation between the presence
or absence of tight clustering of samples from the same patient and the time interval to
tumor progression. The tight clustering of the 9 tumor pairs probably reflects the monoclonal
nature of many bladder tumors (Sidransky et al. 1997). A set of genomic abnormalities like
chromosomal gains and losses characterizes bladder tumors of different stages from single
individuals (Primdahl et al. 2002), and such physical abnormalities could be one of the
causes of the strong similarity of metachronous tumors. The fact that 5 of the tumor pairs
clustered apart may be explained by an oligoclonal origin of these tumors.

Customized GeneChip design, normalization and expression measures

We used a customized Affymetrix GeneChip (Eos Hu03) designed by Eos Biotech Inc., as
described (Eaves et al. 2002). Approximately 45,000 mRNA/EST clusters and 6,200 predicted
exons are represented by the 59,000 probesets on the Eos Hu03 array. Data were normalized
using protocols and software developed at Eos Biotechnology, Inc. (WO0079465).
An "average intensity" (AI) for each probeset was calculated by taking the trimean of probe
intensities following background subtraction and normalization to a gamma distribution
(Tukey 1977).
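
For orientation, the trimean is the weighted average (Q1 + 2 x median + Q3) / 4 of the quartiles. A minimal sketch follows; the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, and the background subtraction and gamma normalization steps are not reproduced.

```python
import statistics


def trimean(values):
    """Trimean: (Q1 + 2 * median + Q3) / 4, a robust measure of location."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q1 + 2 * median + q3) / 4
```

Because it is built from quartiles, the trimean is barely affected by a single outlying probe: `trimean([1, 2, 3, 4, 5])` and `trimean([1, 2, 3, 4, 100])` both give 3.0.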

cRNA preparation, array hybridization and scanning
Preparation of cRNA from total RNA and subsequent hybridization and scanning of the customized
GeneChip microarrays (Eos Hu03) were performed as described previously
(Dyrskjot et al. 2003).

Custom oligonucleotide microarray procedures

Three 60mer oligonucleotides were designed for each of the 45 genes using Array Designer
2.0. All steps in the customized oligonucleotide microarray analysis were performed essentially
as described (Kruhoffer et al.). Each of the probes was spotted in duplicate and all
hybridisations were carried out twice. The samples were labelled with Cy3 and a common
reference pool was labelled with Cy5. The reference pool was made by pooling cRNA
generated from investigated samples and from universal human RNA. Following scanning of
the glass slides the fluorescent intensities were quantified and background adjusted using
SPOT 2.0 (Jain et al. 2002). Data were subsequently normalized using a LOWESS normalisation
procedure implemented in the SMA package for R. To select the best oligonucleotide
probe for each of the 45 genes, 13 of the samples from the training set were re-analysed on
the custom oligonucleotide microarray platform and the obtained expression ratios were
compared to the expression levels from the Affymetrix GeneChips. The oligonucleotide
probes with the highest correlation to the Affymetrix GeneChip probes were selected.

Expression data analysis

Before analysing the expression data from the Eos Hu03 GeneChips, control probes were
removed and only probes with AI levels above 100 in at least 8 experiments and with
max/min equal to or above 1.6 were selected. This filtering generated a gene-set consisting
of 6,647 probes for further analysis. Average linkage hierarchical cluster analysis of the
tumour samples was carried out using a modified Pearson correlation as similarity metric
(Eisen et al. 1998). Genes and arrays were median centered and normalised to the magnitude
of 1 before clustering. We used the GeneCluster 2.0 software for the supervised selection
of markers and for performing permutation tests. The 45 genes for predicting progression
were selected by t-test statistics and cross-validation performance as previously described
(Dyrskjot et al. 2003), and independent samples were classified according to the correlation
to the average no progression signature profile of the 45 genes.
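
The probe filter and the centering step described above can be sketched as below. The thresholds come from the text; the function names, the rejection of probes with non-positive minimum values, and the reading of "magnitude of 1" as unit Euclidean length are assumptions.

```python
import statistics


def passes_filter(intensities, min_level=100.0, min_experiments=8, min_ratio=1.6):
    """Apply the two criteria from the text to one probe's AI values."""
    n_above = sum(1 for value in intensities if value > min_level)
    if n_above < min_experiments:
        return False
    lowest = min(intensities)
    if lowest <= 0:
        # Assumption: a max/min ratio is undefined for non-positive values,
        # so such probes are rejected here.
        return False
    return max(intensities) / lowest >= min_ratio


def median_center_and_scale(values):
    """Subtract the median, then scale the vector to Euclidean length 1."""
    median = statistics.median(values)
    centered = [value - median for value in values]
    norm = sum(c * c for c in centered) ** 0.5
    return [c / norm for c in centered] if norm > 0 else centered
```

The first criterion removes probes that are rarely expressed above background, and the second removes probes that hardly vary between samples and therefore cannot help separate them.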

EXAMPLE 2 

Identifying distinct classes of bladder carcinoma using microarrays 


Patient disease course information - class discovery 

We selected tumours from the entire spectrum of bladder carcinoma for expression profiling 
in order to discover the molecular classes of the disease. The tumours analysed are listed in 
Table 4 below together with the available patient disease course information. 




Table 4. Disease course information of all patients involved - class discovery.



Group


Patient


Previous tumours


Tumour examined on array


Pattern


Reviewed
histology


Subsequent tumours


Carcinoma in situ*


A 


709-1 




Tagr 2 (200297) 


Papillary 


Tagr3 




no 


968-1 




Tagr 2 (011098) 


Papillary 


+ 


Tagr2 (150101) 


no 


934-1 




Tagr 2 (220798) 


Papillary 


+ 




no 


928-1 




Tagr 2 (240698) 


Papillary 


+ 




no 


930-1 




Ta gr 2 (300698) 


Papillary 


+ 




no 


B 


989-1 




Tagr 3 (281 098) 


Papillary 


+ 




no 


1264-1 




Tagr 3 (130600) 


Papillary 


+ 


Tagr 2 (231000) 
Tagr 2 (220101) 
Tagr 2 (300401) 


no 


876-5 


Ta gr 2 (230398) 
Tagr 2 (271098) 
Tagr 2 (090699) 
Tagr 2 (011199) 


Tagr 3 (170400) 


Papillary 


+ 




no


669-7


Tagr 2 (101296) 
Ta gr 2 (150897) 
Tagrl (161297) 
Ta gr 3 (270498) 
Ta gr 2 (220299) 


Tagr 3 (230899) 


Papillary 


Tagr2 


Ta gr 2 (1201 00) 
Tagr 2 (250500) 
Tagr 2 (250900) 
Tagr 2 (050201) 


no 




716-2 


Tagr 2 (070397) 


Tagr 3 (230497) 


Papillary 


+ 


Tagr 2 (040697) 
Tagrl (170698) 


no 


c 


1070-1 




Tagr 3 (150399) 


Papillary 


+ 


Tagr 3 (291099) 


Subsequent visit 




956-2 




Tagr 3 (061299) 


Papillary 


+ 


Tagr 3 (061200) 


Sampling visit 




1062-2 




Tagr 3 (120799) 


Papillary 


+ 


T1 gr 3 (161199) 


Sampling visit 




1166-1 




Tagr 3 (271099) 


Papillary 


+ 




Sampling visit 




1330-1 




Tagr 3 (311000) 


Papillary 


+ 




Sampling visit 


D 


112-10 


Tagr 2 (070794) 
Tagr 3 (01 1294) 
T1 gr 3(150695) 
Tagr 3 (121095) 
T1 gr 3(040396) 
Tagr 2 (200896) 
Tagr 2 (11 1296) 
Tagr 2 (230497) 
Ta gr 2 (030997) 


Tagr 3 (060198) 


Papillary 


+ 


Tagr 3 (110698) 
T1 gr 3 (191098) 
Ta gr 3 (240299) 
T1 gr 3 (050799) 
T1 gr 3 (081199) 
T1 gr 3 (180400) 


Previous visit 




320-7 


T1 gr 3 (011194) 
T1 gr 3 (150896) 
Tagr 3 (100897) 


Tagr 3 (290997) 


Papillary 


+ 


Tagr 3 (290198) 
Tagr 3 (290698) 


Sampling visit 




747-7 


Tagr 2 (010597) 
Tagr 2 (220597) 
Tagr 2 (230997) 
Tagr 2 (260198) 
T1 gr 3 (270498) 
Tagr 2 (170898) 


Tagr 3 (161298) 


Papillary 


+ 


Tagr 2 (050599) 
Tagr 2 (280999) 
Tagr 2 (141299) 


Sampling visit 




967-3 


T1 gr 3 (280998) 
T1 gr 3 (250199) 


Tagr 3 (140699) 


Papillary 


+ 


T1 gr 3 (080999) 


Sampling visit 


E 


625-1 




T1 gr 3 (200996) 


Papillary 


+ 




No 




847-1 




T1 gr 3 (210198) 


Papillary 


+ 




No 




1257-1 




T1 gr 3 (240500) 


Solid 






Sampling visit 




919-1 




T1 gr 3 (220698) ; 


Papillary 


+ 




No 




880-1 




T1 gr 3 (300398) 


Papillary 


+ 


Tagr 2 (091 198) 
Ta gr 1 (090399) 
Tagr 2 (050900) 
Tagr 2 (190301) 


No 




812-1 




T1 gr 3 (061098) 


Papillary 


+ 




No 




1269-1 




T1 gr 3 (230600) 


Papillary 






No 




1083-2 


Tagr 2 (280499) 


T1 gr 3 (120599) 


Papillary 






No | 




1238-1 




T1 gr 3 (020500) 


Papillary 




T2gr 3 (211100) 
Tagr 2 (211100) 


No 




1065-1 




T1 gr 3 (160399) 


Papillary 






Subsequent visit 




1134-1 




T1 gr 3 (181099) 


Papillary 


T2gr3 


T1 gr 3 (280200) 


Sampling visit


T1 gr 3 (020500)
T1 gr 3 (131100) 




F 


1164-1 




T2+ gr 4 (101299) 


Solid 


gr3 




No 




1032-1 




T2+ gr 7(050199) 


Mixed 


- 




Not measured 




1117-1 




T2+ gr 3 (010999) 


Solid 


+ 




Sampling visit 




1178-1 




T2+gr 3 (200100) 


Solid 


+ 




Not measured 




1078-1 




T2+ gr 3 (120499) 


Solid 


+ 




Not measured 




875-1 




T2+ gr 3 (180398) 


Solid 


+ 




No 




1044-1 




T2+ gr 3 (010299) 


Solid 


+ 


T2+ gr 3 (060999) 


Not measured 




1133-1 




T2+ gr 3 (081099) 


Solid 


+ 




Not measured 




1068-1 




T2+ gr 3 (220399) 


Solid 


+ 




No 




937-1 




T2+ gr 3 (280798) 


Solid 






Not measured 



Group A: Ta gr2 tumours - no recurrence within 2 years.
Group B: Ta gr3 tumours - no prior T1 tumour and no carcinoma in situ in random biopsies.
Group C: Ta gr3 tumours - no prior T1 tumour but carcinoma in situ in random biopsies.
Group D: Ta gr3 tumours - a prior T1 tumour and carcinoma in situ in random biopsies.
Group E: T1 gr3 tumours - no prior T2+ tumour.
Group F: T2+ tumours gr3/4 - only primary tumours.

* Carcinoma in situ detected in selected site biopsies at previous, sampling or subsequent
visits.


Two-way hierarchical cluster analysis of tumor samples 

A two-way hierarchical cluster analysis of the tumour samples based on the 1767 gene-set
(see class discovery using hierarchical clustering) remarkably separated all 40 tumours
according to conventional pathological stages and grades with only few exceptions (Fig. 6a).
We identified two main branches containing the superficial Ta tumours, and the invasive T1
and T2+ tumours. In the superficial branch two sub-clusters of tumours could be identified,
one holding 8 tumours that had frequent recurrences and one holding 3 out of the five Ta
grade 2 tumours with no recurrences. In the invasive branch, it was notable that four Ta
grade 3 tumours clustered tightly with the muscle invasive T2+ tumours. These four Ta
tumours, from patients with no previous tumour history, showed concomitant CIS in the
surrounding mucosa, indicating that this sub-fraction of Ta tumours has some of the more
aggressive features found in muscle invasive tumours. The stage T1 cluster could be
separated into three sub-clusters with no clear clinical difference. The one stage T1 grade 3
tumour that clustered with the stage T2+ muscle invasive tumours was the only T1 tumour that
showed a solid growth pattern, all others showing papillary growth. Nine out of ten T2+
tumours were found in one single cluster. The remarkably distinct separation of the tumour
groups according to stage, with practically no overlap between groups, was also
demonstrated by multidimensional scaling analysis (Fig. 6c).

In an attempt to reduce the number of genes needed for class prediction we identified those
genes that were scored by the Cancer Genome Anatomy Project (at NCI) as belonging to
cancer-related groups such as tumour suppressors, oncogenes, cell cycle, etc. These genes
were then selected from the initial 1767 gene-set, and those 88 which showed the largest
variation (SD of the gene vector >=4) were used for hierarchical clustering of the tumour
samples. The obtained clusters were almost identical to the 1767 gene-set cluster dendrogram
(Fig. 6b), indicating that the tumour clustering does not simply reflect larger amounts of
stromal components in the invasive tumour biopsies.

The clustering of the 1767 genes revealed several characteristic profiles in which there was 
a distinct difference between the tumour groups (Fig. 6d; black lines identifying clusters a to 
j). 



Cluster a shows a high expression level in all the Ta grade 3 tumours (Fig. 7a) and, as a
novel finding, contains genes encoding 8 transcription factors as well as other nuclear genes
related to transcriptional activity. Cluster c contains genes that are up-regulated in both Ta
grade 3 with high recurrence rate and CIS, in T2+ and some T1 tumours. This cluster shows
a remarkably tight co-regulation of genes related to cell cycle control and mitosis (Fig. 7c).
Genes encoding cyclins, PCNA as well as a number of centromere related proteins are
present in this cluster. They indicate increased cellular proliferation and may form new targets
for small molecule therapy (Seymour 1999). Cluster f shows a tight cluster of genes related
to keratinisation (Fig. 7f). Two tumours (875-1 and 1178-1) had a very high expression of
these genes and a re-evaluation of the pathology slides revealed that these were the only
two samples to show squamous metaplasia. Thus, activation of this cluster of genes
promotes the squamous metaplasia not infrequently seen by light microscopy in invasive
bladder tumours. The genes in this cluster are listed in Table 5.



Table 5. Genes for classifying samples with squamous metaplasia

Chip acc. # | UniGene Build 162 | Description
D83657_at | Hs.19413 | NM_005621; S100 calcium-binding protein A12
HG3945-HT4215_at | |
J00124_at | |
L05187_at | |
L05188_f_at | Hs.505327 |
L10343_at | Hs.112341 | NM_002638; skin-derived protease inhibitor 3 preproprotein
L42583_f_at | Hs.367762 | NM_005554; keratin 6A
L42601_f_at | Hs.367762 | NM_005554; keratin 6A
L42611_f_at | Hs.446417 | NM_173086; keratin 6 isoform K6e
M19888_at | Hs.1076 | NM_003125; small proline-rich protein 1B (cornifin)
M20030_f_at | Hs.505352 |
M21005_at | [illegible] |
[illegible] | [illegible] | [illegible]; small proline-rich protein 2 [partly illegible]
M86757_s_at | Hs.112408 | NM_002963; S100 calcium-binding protein A7
S72493_s_at | Hs.432448 | NM_005557; keratin 16
U70981_at | Hs.336046 | NM_000640; interleukin 13 receptor, alpha 2 precursor
V01516_f_at | Hs.367762 | NM_005554; keratin 6A
X53065_f_at | |
X57766_at | Hs.143751 | NM_005940; matrix metalloproteinase 11 preproprotein
Z19574_rna1_at | |

Cluster g contains genes that are up-regulated in T2+ tumours and in the Ta grade 3
tumours with CIS that cluster in the invasive branch (Fig. 7g). This cluster contains genes
related to angiogenesis and connective tissue such as laminin, myosin, caldesmon, collagen,
dystrophin, fibronectin, and endoglin. The increased transcription of these genes may
indicate a profound remodelling of the stroma that could reflect signalling from the tumour cells,
from infiltrating lymphocytes, or both. Some of these may also form new drug targets (Fox et
al. 2001). It is remarkable that these genes are those that most clearly separate the Ta grade
3 tumours surrounded by CIS from all other Ta grade 3 tumours. The presence of adjacent
CIS is usually diagnosed by taking a set of eight biopsies from different places in the bladder
mucosa. However, the present data clearly indicate that analysis of stroma remodelling
genes in the Ta tumours could eliminate this invasive procedure.

The clusters b, d, e, h, i, and j contain genes related to nuclear proteins, cell adhesion, 
growth factors, stromal proteins, immune system, and proteases, respectively (see Figure 8). 
A summary of the stage related gene expression is shown in Table 6. 

Table 6. Summary of stage related gene expression



Functional gene clusters a



Tumour stage


Transcription


Nuclear processes


Proliferation


Matrix remodelling


Extracellular matrix


Immune system


Tagr2 


t 








U 


i 


Tagr3 


ttt 


tt 


tt 




W 




T1 gr3 


1* 




tt b 




I 




T2gr3 


t 




ttt 


ttt 


t 


t 


Ta gr3 + CIS 


ttt 


tt 


ttt 


ttt 


t 


t 



a For a detailed description of gene clusters see Fig. 8. 



b An increase in gene expression was only found in about half of the samples analysed. 






Class prediction of bladder tumours

An objective class prediction of bladder tumours based on a limited gene-set is clinically
useful. We therefore built a classifier using tumours correctly separated in the three main
groups as identified in the cluster dendrogram (Fig. 6a). We used a maximum likelihood
classification method with a "leave one out" cross-validation scheme (Shipp et al. 2002; van't
Veer et al. 2002) in which one test tumour was removed from the set, and a set of predictive
genes was selected from the remaining tumour samples for classifying the test tumour. This
process was repeated for all tumours. Predictive genes that showed the largest possible
separation of the three groups were selected for classification, and each tumour was
classified according to how close it was to the means of the three groups (Fig. 8a).

Classification of samples

From the hierarchical cluster analysis of the samples (class discovery) we identified three
major "molecular classes" of bladder carcinoma highly associated with the pathologic staging
of the samples. Based on this finding we decided to build a molecular classifier that assigns
tumours to these three "molecular classes". To build the classifier, we only used the tumours
in which there was a correlation between the "molecular class" and the associated pathologic
stage. Consequently, a T1 tumour clustering in the "molecular class" of T2 tumours was not
used to build the classifier.

The genes used in the classifier were those genes with the highest values of the ratio (B/W)
of the variation between the groups to the variation within the groups. High values of the ratio
(B/W) signify genes with good group separation performance. We calculated the sum over
the genes of the squared distance from the sample value to the group mean and classified
the sample as belonging to the group where the distance to the group mean was smallest. If
the difference between the distances to the closest and the second closest group, relative
to the distance to the closest group, was below 5%, the classification failed and
the sample was classified as belonging to both groups. This relative difference is referred to
as the classifier strength.
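
A minimal sketch of the B/W ratio and the nearest-group-mean rule follows. It assumes that "variation" is measured as a sum of squares (between groups and within groups); the function names and the example group labels are hypothetical, and `bw_ratio` expects genes with non-zero within-group variation.

```python
import statistics


def bw_ratio(values, groups):
    """Between-group over within-group sum of squares for one gene."""
    overall_mean = statistics.fmean(values)
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    between = sum(
        len(members) * (statistics.fmean(members) - overall_mean) ** 2
        for members in by_group.values()
    )
    within = sum(
        (value - statistics.fmean(members)) ** 2
        for members in by_group.values()
        for value in members
    )
    return between / within


def classify(sample, group_means, min_strength=0.05):
    """Assign a sample to the closest group mean (squared Euclidean distance).

    Returns (set_of_groups, strength); both of the two closest groups are
    returned when the classifier strength is below min_strength.
    """
    distances = sorted(
        (sum((s - m) ** 2 for s, m in zip(sample, means)), name)
        for name, means in group_means.items()
    )
    (closest, best), (second, runner_up) = distances[0], distances[1]
    strength = (second - closest) / closest if closest > 0 else float("inf")
    if strength < min_strength:
        return {best, runner_up}, strength
    return {best}, strength
```

For example, with group means `{"Ta": [0, 0], "T1": [10, 0], "T2+": [0, 10]}` a sample at `[1, 0]` is assigned to Ta alone, while a sample at `[5, 0]` is equidistant from Ta and T1 and is therefore reported as belonging to both.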

Classifier performance 

The classifier performance was tested using from 1 to 160 genes in cross-validation loops.
Figure 9 shows that the closest correlation to histopathology is obtained in the
cross-validation model using from 69 to 97 genes. Based on this we chose the model using 80 genes
for cross-validation as our final classifier model.

Classifier model using 71 genes

We selected those genes for our final classifier model that were used in at least 75% (25
times) of the cross-validation loops. These 71 genes are listed in Table 7.



Table 7. Feature: Accession number on HuGene fl array. Number: Number of times used in
the 80 genes cross validation loops. Test (B/W): see below.



Feature


Unigene
Build 162


Description


Number


Test (B/W)

AF000231_at 


Hs.75618 


NM_004663; Ras-related protein Rab-11A 


33 


26.77 


D13666_s_at 


Hs.136348 


NMJJ06475; osteoblast specific factor 2 (fasciclln Mike) 


33 


27.71 


D49372_s_at 


Hs.54460 i 


NM_002986; small Inducible cytokine A11 precursor 


31 


25.78 


D83920_at 


Hs.440898 


NM_002003; ficolin 1 precursor 


33 


31.18 


D86479_at 


Hs.439463 


NM_001129; adipocyte enhancer binding protein 1 precursor 


33 


28.29 


D89077_at 


Hs.75367 


NMJJ06748; Src-like-adaptor 


33 


30.03 


D89377_at 


Hs.89404 


NM_002449; msh homeo box homolog 2 


33 


51.50 


HG4069-HT4339_s_at 






27 


25.06 


HG67-HT67JLat 






33 


27.81 


HG907-HT907_at 






33 


25.76 




Hs.436317 


NMJ500779; cytochrome P450, family 4, subfamily B, poly- 
peptide 1 


33 


32.61 


J 03278 at 


Hs.307783 


NMJD02609; platelet-derived growth factor receptor beta 
precursor 


33 


28.02 


j 04058 at 


Hs.169919 


NM_000126; electron transfer flavoproteln, alpha polypep- 
tide 


oo 


29.4o 


J05032_at 


Hs.32393 


NM_001349; aspartyMRNA synthetase 


33 


38.21 


J05070_at 


Hs. 151 738 


NMJJ04994; matrix metailoprotelnase 9 preproprotein 


33 


35.34 


J05448_at 


Hs.79402 


NM_002694; DNA directed RNA polymerase II polypeptide 
C NM_032940; DNA directed RNA polymerase II polypep- 
tide C 


32 


26.51 


K01396_at 


Hs.297681 


NMJX)0295; serine (or cysteine) proteinase inhibitor, clade 
A (alpha-1 antiprotelnase, antitrypsin), member 1 


oo 


op cc 


L13720_at 


Hs.437710 


NM_000820; growth arrest-specific 6 


33 


29.69 


M12125_at 


Hs.300772 


NM_003289; tropomyosin 2 (beta) 


28 


24.89 


M15395_at 


Hs.375957 


NM_000211; Integrin beta chain, beta 2 precursor 


33 


29.40 


M16591_s_at 


Hs.89555 


NM.0021 10; hemopoietic cell kinase Isoform p61HCK 


33 


32.34 


M20530_at 






33 


30.28 


M23178_s_at 


Hs.73817 


NMJ)02983; chemokine (C-C motif) llgand 3 


33 


35.36 


M32011_at 


Hs.949 


NM_000433; neutrophil cytosollc factor 2 


33 


41.88 


M33195_at 


Hs.433300 


NM_004106; Fc fragment of IgE, high affinity I, receptor for, 
gamma polypeptide precursor 


33 


30.40 


M55998_s_at 


Hs.172928 


NM_000088; alpha 1 type I collagen preproprotein 


33 


26.83 


M57731_s_at 


Hs.75765 


NM_002089; chemokine (C-X-C motif) ligand 2 


33 


31.84 


M68840_at 


Hs.183109 


NM_000240; monoamine oxidase A 


33 


32.39 


M69203_s_at 


Hs.75703 


NM_002984; chemokine (C-C motif) ligand 4 precursor 


33 


36.21 


M72885__rna1_s_at 






33 


27.94


M83822_at


Hs.209846 


iNwi__yuo/ ^o, L-no-reofJuiibive vesicie iraTncKiny , ueacn ana 
anchor containing 


33 


26.44 


S77393 at 

\J 9 § W WW O » 


Hs 145754 


MM H*l R531 • Wninnof-litro fartnr 1 fKoolo\ 

inivi_u i Dww i, iMuppei-HKo racior o iDasicj 


OO 


49.85 


| U01833_at | Hs.81469 | NM_002484; nucleotide binding protein 1 (MinD homolog, E. coli) | 33 | 30.62 |
| U07231_at | Hs.309763 | NM_002092; G-rich RNA sequence binding factor 1 | 33 | ?9.10 |
| U09937_rna1_s_at | | | 33 | [illegible] |
| U10550_at | Hs.79022 | NM_005261; GTP-binding mitogen-induced T-cell protein; NM_181702; GTP-binding mitogen-induced T-cell protein | 28 | 25.26 |
| U20158_at | Hs.2488 | NM_005565; lymphocyte cytosolic protein 2 | 33 | 32.41 |
| U41315_rna1_s_at | | | 33 | 43.56 |
| U47414_at | Hs.13291 | NM_004354; cyclin G2 | 33 | 44.42 |
| U49352_at | Hs.414754 | NM_001359; 2,4-dienoyl CoA reductase 1 precursor | 33 | 37.04 |
| U50708_at | Hs.1265 | NM_000056; branched chain keto acid dehydrogenase E1, beta polypeptide precursor; NM_183050; branched chain keto acid dehydrogenase E1, beta polypeptide precursor | 33 | 42.89 |
| U52101_at | Hs.9999 | NM_001425; epithelial membrane protein 3 | 33 | 29.86 |
| U64520_at | Hs.66708 | NM_004781; vesicle-associated membrane protein 3 (cellubrevin) | 33 | 30.17 |
| U65093_at | Hs.82071 | NM_006079; Cbp/p300-interacting transactivator, with Glu/Asp-rich carboxy-terminal domain, 2 | 33 | 32.07 |
| U68019_at | Hs.288261 | NM_005902; MAD, mothers against decapentaplegic homolog 3 | 31 | 26.70 |
| U68385_at | Hs.380923 | | 33 | 31.56 |
| U74324_at | Hs.90875 | NM_002871; RAB-interacting factor | 33 | 30.26 |
| U77970_at | Hs.321164 | NM_002518; neuronal PAS domain protein 2; NM_032235 | 33 | 50.37 |
| U90549_at | Hs.236774 | NM_006353; high mobility group nucleosomal binding domain 4 | 33 | 32.16 |
| X04085_rna1_at | | | 28 | 25.13 |
| X07743_at | Hs.77436 | NM_002664; pleckstrin | 33 | 28.13 |
| X13334_at | Hs.75627 | NM_000591; CD14 antigen precursor | 33 | 35.79 |
| X14046_at | Hs.153053 | NM_001774; CD37 antigen | 30 | 24.70 |
| X15880_at | Hs.415997 | NM_001848; collagen, type VI, alpha 1 precursor | 33 | 31.51 |
| X15882_at | Hs.420269 | NM_001849; alpha 2 type VI collagen isoform 2C2 precursor; NM_058174; alpha 2 type VI collagen isoform 2C2a precursor; NM_058175; alpha 2 type VI collagen isoform 2C2a' precursor | 33 | 32.32 |
| X51408_at | Hs.380138 | NM_001822; chimerin (chimaerin) 1 | 33 | 30.51 |
| X53800_s_at | Hs.89690 | NM_002090; chemokine (C-X-C motif) ligand 3 | 33 | 33.63 |
| X54489_rna1_at | | | 33 | 33.57 |
| X57579_s_at | | | 33 | 41.43 |
| X64072_s_at | Hs.375957 | NM_000211; integrin beta chain, beta 2 precursor | 33 | 43.21 |
| X67491_f_at | Hs.355697 | NM_005271; glutamate dehydrogenase 1 | 33 | 30.97 |
| X68194_at | Hs.80919 | NM_006754; synaptophysin-like protein isoform a; NM_182715; synaptophysin-like protein isoform b | 33 | 46.53 |
| X73882_at | Hs.254605 | NM_003980; microtubule-associated protein 7 | 33 | 53.16 |
| X78520_at | Hs.372528 | NM_001829; chloride channel 3 | 33 | 47.38 |
| Y00787_s_at | Hs.624 | NM_000584; interleukin 8 precursor | 32 | 27.54 |
| Z12173_at | Hs.334534 | NM_002076; glucosamine (N-acetyl)-6-sulfatase precursor | 30 | 25.44 |
| Z19554_s_at | Hs.435800 | NM_003380; vimentin | 27 | 24.59 |
| Z26491_s_at | Hs.240013 | NM_000754; catechol-O-methyltransferase isoform MB-COMT; NM_007310; catechol-O-methyltransferase isoform S-COMT | 32 | 26.92 |
| Z29331_at | Hs.372758 | NM_003344; ubiquitin-conjugating enzyme E2H isoform 1; NM_182697; ubiquitin-conjugating enzyme E2H isoform 2 | 33 | 33.49 |
| Z48605_at | Hs.421825 | NM_006903; inorganic pyrophosphatase 2 isoform 2; NM_176865; NM_176866; inorganic pyrophosphatase 2 isoform 3; NM_176867; inorganic pyrophosphatase 2 isoform 4; NM_176869; inorganic pyrophosphatase 2 isoform 1 | 33 | 44.45 |
| Z74615_at | Hs.172928 | NM_000088; alpha 1 type I collagen preproprotein | 33 | 55.18 |



Test for significance of classifier 
To test the class separation performance of the 71 selected genes, we compared their B/W ratios with the corresponding ratios for all genes calculated from permutations of the arrays. For each permutation we constructed three pseudogroups, pseudo-Ta, pseudo-T1, and pseudo-T2, so that the proportion of samples from the three original groups was approximately the same in each of the three pseudogroups. We then calculated the ratio of the variation between the pseudogroups to the variation within the pseudogroups for all genes. In 500 permutations, only twice did a single gene have a B/W value higher than the lowest of the original B/W values of the 71 selected genes (the two values being 25.28 and 25.93).
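
The permutation test described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the B/W ratio is taken here as the between-group sum of squares divided by the within-group sum of squares, the pseudogroups are built by shuffling each original group and dealing its samples round-robin into the pseudogroups, and the function names `bw_ratio`, `stratified_pseudogroups` and `count_exceedances` are hypothetical helpers, not part of the method as filed.

```python
import numpy as np


def bw_ratio(x, labels):
    """Between/within sum-of-squares ratio per gene (x is genes x samples).

    Assumption: B/W is the between-group sum of squares divided by the
    within-group sum of squares.
    """
    overall = x.mean(axis=1, keepdims=True)
    between = np.zeros(x.shape[0])
    within = np.zeros(x.shape[0])
    for g in np.unique(labels):
        xg = x[:, labels == g]
        mg = xg.mean(axis=1, keepdims=True)
        between += xg.shape[1] * (mg - overall)[:, 0] ** 2
        within += ((xg - mg) ** 2).sum(axis=1)
    return between / within


def stratified_pseudogroups(labels, n_groups, rng):
    """Assign pseudogroup labels so each original group is spread evenly."""
    pseudo = np.empty(len(labels), dtype=int)
    for g in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == g))
        offset = rng.integers(n_groups)  # random start avoids a fixed bias
        pseudo[idx] = (np.arange(len(idx)) + offset) % n_groups
    return pseudo


def count_exceedances(x, labels, threshold, n_perm=500, seed=0):
    """Count genes whose permuted B/W exceeds the threshold, over all permutations."""
    rng = np.random.default_rng(seed)
    n_groups = len(np.unique(labels))
    hits = 0
    for _ in range(n_perm):
        pseudo = stratified_pseudogroups(labels, n_groups, rng)
        hits += int((bw_ratio(x, pseudo) > threshold).sum())
    return hits
```

Here `threshold` would be the lowest B/W value among the selected genes, and a small return value from `count_exceedances` corresponds to the observation that permuted data rarely reach that value.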

The classifier performance was tested using from 1 to 160 genes in cross-validation loops, and a model using an 80-gene cross-validation scheme showed the best correlation to pathologic staging (p < 10^-9). The 71 genes that were used in at least 75% of the cross-validation loops were selected to constitute our final classifier model. See the expression profiles of the 71 genes in Figure 10. The genes are clustered to obtain a better overview of similar expression patterns. From this it is apparent that the T1 stage is characterised by expression patterns in common with either Ta or T2 tumours. No single gene can be used as a T1 marker.
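
The selection of genes that recur across cross-validation loops can be illustrated with the following Python sketch. It rests on assumptions not stated in the text: the loops are taken to be leave-one-out, genes are ranked in each loop by a between/within sum-of-squares score, and `bw_score` and `stable_genes` are hypothetical names introduced for illustration only.

```python
import numpy as np


def bw_score(x, labels):
    """Between/within sum-of-squares ratio per gene (assumed ranking score)."""
    overall = x.mean(axis=1, keepdims=True)
    between = np.zeros(x.shape[0])
    within = np.zeros(x.shape[0])
    for g in np.unique(labels):
        xg = x[:, labels == g]
        mg = xg.mean(axis=1, keepdims=True)
        between += xg.shape[1] * (mg - overall)[:, 0] ** 2
        within += ((xg - mg) ** 2).sum(axis=1)
    return between / within


def stable_genes(x, labels, top_k=80, min_fraction=0.75):
    """Return indices of genes picked in at least min_fraction of the loops.

    Assumption: one loop per sample, with that sample left out.
    """
    n_genes, n_samples = x.shape
    counts = np.zeros(n_genes, dtype=int)
    for left_out in range(n_samples):
        keep = np.arange(n_samples) != left_out
        scores = bw_score(x[:, keep], labels[keep])
        top = np.argsort(scores)[::-1][:top_k]  # best-scoring genes first
        counts[top] += 1
    selected = np.flatnonzero(counts >= min_fraction * n_samples)
    return selected, counts
```

With `top_k=80` and `min_fraction=0.75`, the returned index set plays the role of the final gene list, and `counts` records how many loops each gene was used in.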

Permutation analysis 

To test the class separation performance of the 71 selected genes, we compared their performance to that of a permuted set of pseudo-Ta, pseudo-T1 and pseudo-T2 tumours. In 500 permutations we detected only two genes with a performance equal to that of the poorest-performing classifying genes.




Classification using 80 predictive genes and other gene-sets 

The classification using 80 predictive genes in cross-validation loops identified the Ta group with no surrounding CIS and no previous tumour or no previous tumour of a higher stage (Table 8). Interestingly, the Ta tumours surrounded by CIS that were classified as T2 or T1 clearly demonstrate the potential of the classification method for identifying surrounding CIS in a non-invasive way, thereby supplementing clinical and pathologic information.



Table 8. Clinical data on disease courses and results of molecular classification

| Patient | Tumours (a): previous | Tumours (a): analysed | Tumours (a): subsequent | Carcinoma in situ (b) | Reviewed histology (c) | Molecular classifier (d): 320 genes | 80 genes | 20 genes |
|---|---|---|---|---|---|---|---|---|
| Ta grade II tumours - no progression | | | | | | | | |
| 709-1 | | Ta gr2 | | No | Ta gr3 | Ta | Ta | Ta |
| 968-1 | | Ta gr2 | 1 Ta | No | | Tam | Ta | Ta |
| 934-1 | | Ta gr2 | | No | | T1 | Ta | Ta |
| 928-1 | | Ta gr2 | | No | | Ta | Ta | T1 |
| 930-1 | | Ta gr2 | | No | | Ta | Ta | Ta |
| Ta grade III tumours - no prior T1 tumour or CIS | | | | | | | | |
| 989-1 | | Ta gr3 | | No | | Ta | Ta | Ta |
| 1264-1 | | Ta gr3 | 3 Ta | No | | Ta | Ta | Ta |
| 876-6 | 4 Ta | Ta gr3 | | No | | Ta | Ta | Ta |
| 669-7 | 5 Ta | Ta gr3 | 4 Ta | No | Ta gr2 | Ta | Ta | Ta |
| 716-2 | 1 Ta | Ta gr3 | 2 Ta | No | | Ta | Ta | Ta |
| Ta grade III tumours - no prior T1 tumour but CIS in selected site biopsies | | | | | | | | |
| 1070-1 | | Ta gr3 | 1 Ta | Subsequent visit | | Ta | Ta | Ta |
| 956-2 | | Ta gr3 | 1 Ta | Sampling visit | | T2 | T2 | T2/T1 |
| 1062-2 | | Ta gr3 | 1 T1 | Sampling visit | | T2/Ta | T1/Ta | Ta |
| 1166-1 | | Ta gr3 | | Sampling visit | | Tam | Ta | Ta |
| 1330-1 | | Ta gr3 | | Sampling visit | | T2 | T2 | Ta |
| Ta grade III tumours - a prior T1 tumour and CIS in selected site biopsies | | | | | | | | |
| 747-7 | 5 Ta, 1 T1 | Ta gr3 | 3 Ta | Sampling visit | | Ta | Ta | Ta |
| 112-10 | 7 Ta, 2 T1 | Ta gr3 | 2 Ta, 4 T1 | Previous visit | | Ta | Ta | Ta |
| 320-7 | 1 Ta, 2 T1 | Ta gr3 | 2 Ta | Sampling visit | | T2 | T2 | Ta |
| 967-3 | 2 T1 | Ta gr3 | 1 T1 | Sampling visit | | Ta | Ta | Ta |
| T1 grade III tumours - no prior muscle invasive tumour | | | | | | | | |
| 625-1 | | T1 gr3 | | No | | T1 | T1 | T1 |
| 847-1 | | T1 gr3 | | No | | T1 | T1 | T1 |
| 1257-1 | | T1 gr3 | | Sampling visit | | T1 | T1 | T1 |
| 919-1 | | T1 gr3 | | No | | T1 | T1 | T1 |
| 880-1 | | T1 gr3 | 4 Ta | No | | T1 | T1 | T1 |
| 812-1 | | T1 gr3 | | No | | T1 | T1 | T1 |
| 1269-1 | | T1 gr3 | | No | No review | T1 | T1 | T1 |
| 1083-2 | 1 Ta | T1 gr3 | | No | No review | T1 | T1 | T1 |
| 1238-1 | | T1 gr3 | 1 Ta, 1 T2+ | No | | T1 | T1 | T1 |
| 1065-1 | | T1 gr3 | | Subsequent visit | No review | T1 | T1 | T1 |
| 1134-1 | | T1 gr3 | 3 T1 | Sampling visit | T2 gr3 | T1 | T1 | T1 |
| T2+ grade III/IV tumours - only primary tumours | | | | | | | | |
| 1164-1 | | T2+ gr4 | | No | T2+ gr3 | T2/T1 | T1 | T1 |
| 1032-1 | | T2+ gr? | | ND | No review | T2 | T2 | T2 |
| 1117-1 | | T2+ gr3 | | ND | | T2 | T2 | T1 |
| 1178-1 | | T2+ gr3 | | ND | | T2 | T2 | T2 |
| 1078-1 | | T2+ gr3 | | ND | | T2 | T2 | T2 |
| 875-1 | | T2+ gr3 | | No | | T2 | T2 | T2 |
| 1044-1 | | T2+ gr3 | 1 T2+ | ND | | T2 | T2 | T2 |
| 1133-1 | | T2+ gr3 | | ND | | T2 | T2 | T2 |
| 1068-1 | | T2+ gr3 | | No | | T2 | T2 | T2 |
| 937-1 | | T2+ gr3 | | ND | No review | T1 | T1 | T1 |



a Examples of tumour histology.

b Carcinoma in situ detected in selected site biopsies at the time of sampling tumour tissue for the arrays or at previous or subsequent visits.

c All tumours were reviewed by a single uro-pathologist and any change compared to the routine classification is listed.

d Molecular classification based on 320-, 80-, and 20-gene cross-validation loops.

Classification using other gene-sets 

Classification was also carried out using other gene-sets (10, 20, 32, 40, 80, 160, and 320 genes). These gene-sets demonstrated the same classification tendency as the 71 genes. See Tables 9-15 for gene-sets.



Table 9. 320 genes for classifier

| Chip acc. # | UniGene Build 162 | Description |
|---|---|---|
| AB000220_at | Hs.171921 | NM_006379; semaphorin 3C |
| AC002073_cds1_at | | |
| AF000231_at | Hs.75618 | NM_004663; Ras-related protein Rab-11A |
| D10922_s_at | Hs.99855 | NM_001462; formyl peptide receptor-like 1 |
| D10925_at | Hs.301921 | NM_001295; chemokine (C-C motif) receptor 1 |
| D11086_at | Hs.84 | NM_000206; interleukin 2 receptor, gamma chain, precursor |
| D11151_at | Hs.211202 | NM_001957; endothelin receptor type A |
| D13435_at | Hs.426142 | NM_002643; phosphatidylinositol glycan, class F isoform 1; NM_173074; phosphatidylinositol glycan, class F isoform 2 |
| D13666_s_at | Hs.136348 | NM_006475; osteoblast specific factor 2 (fasciclin I-like) |
| D14520_at | Hs.84728 | NM_001730; Kruppel-like factor 5 |
| D21878_at | Hs.169998 | NM_004334; bone marrow stromal cell antigen 1 precursor |
| D26443_at | Hs.371369 | NM_004172; solute carrier family 1 (glial high affinity glutamate transporter), member 3 |
| D28589_at | Hs.17719 | |
| D42046_at | Hs.194665 | |
| D45370_at | Hs.74120 | NM_006829; adipose specific 2 |
| D49372_s_at | Hs.54460 | NM_002986; small inducible cytokine A11 precursor |
| D50495_at | Hs.224397 | NM_003195; transcription elongation factor A (SII), 2 |
| D63135_at | Hs.27935 | NM_032646; tweety homolog 2 |
| D64053_at | Hs.198288 | NM_002849; protein tyrosine phosphatase, receptor type, R isoform 1 precursor; NM_130846; protein tyrosine phosphatase, receptor type, R isoform 2 |
| D83920_at | Hs.440898 | NM_002003; ficolin 1 precursor |
| D85131_s_at | Hs.433881 | NM_002383; MYC-associated zinc finger protein |
| D86062_s_at | Hs.413482 | NM_004649; chromosome 21 open reading frame 33 |
| D86479_at | Hs.439463 | NM_001129; adipocyte enhancer binding protein 1 precursor |
| D86957_at | Hs.307944 | |
| D86959_at | Hs.105751 | NM_014720; Ste20-related serine/threonine kinase |
| D86976_at | Hs.196914 | |
| D87433_at | Hs.301989 | NM_015136; stabilin 1 |
| D87443_at | Hs.409862 | NM_014758; sorting nexin 19 |
| D87682_at | Hs.134792 | |
| D89077_at | Hs.75367 | NM_006748; Src-like-adaptor |
| D89377_at | Hs.89404 | NM_002449; msh homeo box homolog 2 |
| D90279_s_at | Hs.433695 | NM_000093; alpha 1 type V collagen preproprotein |
| HG1996-HT2044_at | | |
| HG2090-HT2152_s_at | | |
| HG2463-HT2559_at | | |
| HG2994-HT4850_s_at | | |
| HG3044-HT3742_s_at | | |
| HG3187-HT3366_s_at | | |
| HG3342-HT3519_s_at | | |
| HG371-HT26388_s_at | | |
| HG4069-HT4339_s_at | | |
| HG67-HT67_f_at | | |
| HG907-HT907_at | | |
| J02871_s_at | Hs.436317 | NM_000779; cytochrome P450, family 4, subfamily B, polypeptide 1 |
| J03040_at | Hs.111779 | NM_003118; secreted protein, acidic, cysteine-rich (osteonectin) |
| J03060_at | | |
| J03068_at | | |
| J03241_s_at | Hs.2025 | NM_003239; transforming growth factor, beta 3 |
| J03278_at | Hs.307783 | NM_002609; platelet-derived growth factor receptor beta precursor |
| J03909_at | | |
| J03925_at | Hs.172631 | NM_000632; integrin alpha M precursor |
| J04056_at | Hs.88778 | NM_001757; carbonyl reductase 1 |
| J04058_at | Hs.169919 | NM_000126; electron transfer flavoprotein, alpha polypeptide |
| J04093_s_at | Hs.278896 | NM_019075; UDP glycosyltransferase 1 family, polypeptide A10 |
| J04130_s_at | Hs.75703 | NM_002984; chemokine (C-C motif) ligand 4 precursor |
| J04152_rna1_s_at | | |
| J04162_at | Hs.372679 | NM_000569; Fc fragment of IgG, low affinity IIIa, receptor for (CD16) |
| J04456_at | Hs.407909 | NM_002305; beta-galactosidase binding lectin precursor |
| J05032_at | Hs.32393 | NM_001349; aspartyl-tRNA synthetase |
| J05036_s_at | Hs.1355 | NM_001910; cathepsin E isoform a preproprotein; NM_148964; cathepsin E isoform b preproprotein |
| J05070_at | Hs.151738 | NM_004994; matrix metalloproteinase 9 preproprotein |
| J05448_at | Hs.79402 | NM_002694; DNA directed RNA polymerase II polypeptide C; NM_032940; DNA directed RNA polymerase II polypeptide C |
| K01396_at | Hs.297681 | NM_000295; serine (or cysteine) proteinase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 1 |
| K03430_at | | |
| L06797_s_at | Hs.421986 | NM_003467; chemokine (C-X-C motif) receptor 4 |
| L10343_at | Hs.112341 | NM_002638; skin-derived protease inhibitor 3 preproprotein |
| L11708_at | Hs.155109 | NM_002153; hydroxysteroid (17-beta) dehydrogenase 2 |
| L13391_at | Hs.78944 | NM_002923; regulator of G-protein signalling 2, 24kDa |
| L13698_at | Hs.65029 | NM_002048; growth arrest-specific 1 |
| L13720_at | Hs.437710 | NM_000820; growth arrest-specific 6 |
| L13923_at | Hs.750 | NM_000138; fibrillin 1 |

















Table 10. 160 Genes for classifier 



Chip acc. # 


UnlGene Build 162 


description 


AF000231_at 


Hs.75618 


NM_004663; Ras-related protein Rab-11A 


D13666_s_at 


Hs.136348 


NMJ)06475; osteoblast specific factor 2 (fesciciln l-like) 


D21878_at 


Hs.1 69998 


NM_004334; bone marrow stromal cell antigen 1 precursor 


D45370_at 


Hs.74120 


NM_006829; adipose specific 2 


D49372_s_at 


Hs.54460 


NMJ)02986; small inducible cytokine A1 1 precursor 


D83920_at 


Hs.440898 


NM_002003; ficoiin 1 precursor 


D85131_s_at 


Hs.433881 


NM_002383; MYC-associated zinc finger protein 


D86062_s_at 


Hs.413482 


NM_004649; chromosome 21 open reading frame 33 


D86479_at 


Hs.439463 


NM_001129; adipocyte enhancer binding protein 1 precursor 


D86957_at 


Hs.307944 




D86976_at 


Hs.1 96914 




D87433_at 


Hs.301989 


NM_015136; stabilin 1 


D89077_at 


Hs.75367 


NM_006748; Src-like-adaptor 


D89377_at 


Hs.89404 


NM_002449; msh homeo box homolog 2


HG3044-HT3742_s_at






HG371-HT26388_s_at 






HG4069-HT4339_s_at






HG67-HT67_f_at 






HG907-HT907_at 






J02871_s_at 


Hs.436317 


NM_000779; cytochrome P450, family 4, subfamily B, polypeptide
1


J03040_at 


Hs.111779


NM_003118; secreted protein, acidic, cysteine-rich (osteonectin)


J03068_at 






J03241_s_at 


Hs.2025 


NM_003239; transforming growth factor, beta 3 


J03278_at 


Hs.307783 


NM_002609; platelet-derived growth factor receptor beta precursor 


J03909_at 






J04058_at 


Hs.169919


NM_000126; electron transfer flavoprotein, alpha polypeptide 
J04130_s_at


Hs.75703 


NM_002984; chemokine (C-C motif) ligand 4 precursor


J04162_at 


Hs.372679 


NM_000569; Fc fragment of IgG, low affinity IIIa, receptor for
(CD16)


J04456_at 


Hs.407909 


NM_002305; beta-galactosidase binding lectin precursor


J05032_at 


Hs.32393 


NM_001349; aspartyl-tRNA synthetase 


J05070_at 


Hs.151738


NM_004994; matrix metalloproteinase 9 preproprotein 


J05448_at 


Hs.79402 


NM_002694; DNA directed RNA polymerase II polypeptide C
NM_032940; DNA directed RNA polymerase II polypeptide C


K01396_at 


Hs.297681 


NM_000295; serine (or cysteine) proteinase inhibitor, clade A 
(alpha-1 antiproteinase, antitrypsin), member 1 


K03430_at 






L13698_at 


Hs.65029 


NM_002048; growth arrest-specific 1


L13720_at 


Hs.437710 


NM_000820; growth arrest-specific 6 


L13923_at 


Hs.750 


NM_000138; fibrillin 1 


L15409_at 


Hs.421597 


NM_000551; elongin binding protein


L17325_at 


Hs.195825


NM_006867; RNA-binding protein with multiple splicing


L19872_at 


Hs.170087 


NM_001621; aryl hydrocarbon receptor 


L27476_at 


Hs.75608 


NM_004817; tight junction protein 2 (zona occludens 2)


L33799_at 


Hs.202097 


NM_002593; procollagen C-endopeptidase enhancer


L40388_at 


Hs.30212 


NM_004236; thyroid receptor interacting protein 15


L40904_at 


Hs.387667 


NM_005037; peroxisome proliferative activated receptor gamma 
isoform 1 NM_015869; peroxisome proliferative activated receptor
gamma isoform 2 NM_138711; peroxisome proliferative activated
receptor gamma isoform 1 NM_138712; peroxisome proliferative
activated receptor gamma isoform 1 


L41919_rna1_at






M11433_at 


Hs.101850 


NM_002899; retinol binding protein 1, cellular


M11718_at 


Hs.283393 


NM_000393; alpha 2 type V collagen preproprotein 


M12125_at 


Hs.300772 


NM_003289; tropomyosin 2 (beta) 


M14218_at 


Hs.442047 


NM_000048; argininosuccinate lyase 


M15395_at 


Hs.375957 


NM_000211; integrin beta chain, beta 2 precursor 


M16591_s_at 


Hs.89555 


NM_002110; hemopoietic cell kinase isoform p61HCK 


M17219_at 


Hs.203862 


NM_002069; guanine nucleotide binding protein (G protein), alpha
inhibiting activity polypeptide 1 


M20530_at 






M23178_s_at 


Hs.73817 


NM_002983; chemokine (C-C motif) ligand 3


M28130_rna1_s_at






M29550_at 


Hs.187543


NM_021132; protein phosphatase 3 (formerly 2B), catalytic subunit,
beta isoform (calcineurin A beta)


M31165_at 


Hs.407546 


NM_007115; tumor necrosis factor, alpha-induced protein 6 precursor


M32011_at 


Hs.949 


NM_000433; neutrophil cytosolic factor 2 


M33195_at 


Hs.433300 


NM_004106; Fc fragment of IgE, high affinity I, receptor for,
gamma polypeptide precursor 


M37033_at 


Hs.443057 


NM_000560; CD53 antigen 


M37766_at 


Hs.901 


NM_001778; CD48 antigen (B-cell membrane protein)


M55998_s_at 


Hs.172928 


NM_000088; alpha 1 type I collagen preproprotein
M57731_s_at


Hs.75765 


NM_002089; chemokine (C-X-C motif) ligand 2


M62840_at 


Hs.82542 


NM_001637; acyloxyacyl hydrolase precursor


M63262_at






M68840_at 


Hs.183109 


NM_000240; monoamine oxidase A 


M69203_s_at 


Hs.75703 


NM_002984; chemokine (C-C motif) ligand 4 precursor


M72885_rna1_s_at






M77349_at 


Hs.421496 


NM_000358; transforming growth factor, beta-induced, 68kDa 


M82882_at 


Hs.124030


NM_172373; E74-like factor 1 (ets domain transcription factor)


M83822_at 


Hs.209846 


NM_006726; LPS-responsive vesicle trafficking, beach and anchor 
containing 


M92934_at 


Hs.410037 


NM_001901; connective tissue growth factor 


M95178_at 


Hs.119000


NM_001102; actinin, alpha 1


S69115_at 


Hs.10306 


NM_005601; natural killer cell group 7 sequence


S77393_at 


Hs.145754 


NM_016531; Kruppel-like factor 3 (basic) 


S78187_at 


Hs.153752


NM_004358; cell division cycle 25B isoform 1 NM_021872; cell
division cycle 25B isoform 2 NM_021873; cell division cycle 25B 
isoform 3 NM_021874; cell division cycle 25B isoform 4 


U01833_at


Hs.81469 


NM_002484; nucleotide binding protein 1 (MinD homolog, E. coli)


U07231_at 


Hs.309763 


NM_002092; G-rich RNA sequence binding factor 1


U09278_at 


Hs.436852 


NM_004460; fibroblast activation protein, alpha subunit


U09937_rna1_s_at






U10550_at 


Hs.79022 


NM_005261; GTP-binding mitogen-induced T-cell protein
NM_181702; GTP-binding mitogen-induced T-cell protein 


U12424_s_at 


Hs.108646 


NM_000408; glycerol-3-phosphate dehydrogenase 2 (mitochondrial)


U16306_at 


Hs.434488 


NM_004385; chondroitin sulfate proteoglycan 2 (versican) 


U20158_at 


Hs.2488 


NM_005565; lymphocyte cytosolic protein 2 


U20536_s_at 


Hs.3280 


NM_001226; caspase 6 isoform alpha preproprotein NM_032992;
caspase 6 isoform beta


U24266_at 


Hs.77448 


NM_003748; aldehyde dehydrogenase 4A1 precursor 
NM_170726; aldehyde dehydrogenase 4A1 precursor


U28249_at 


Hs.301350 


NM_005971; FXYD domain containing ion transport regulator 3
isoform 1 precursor NM_021910; FXYD domain containing ion 
transport regulator 3 isoform 2 precursor 


U28488_s_at 


Hs.155935 


NM_004054; complement component 3a receptor 1 


U29680_at 


Hs.227817 


NM_004049; BCL2-related protein A1


U37143_at 


Hs.152096


NM_000775; cytochrome P450, family 2, subfamily J, polypeptide
2


U38864_at 


Hs.108139 


NM_012256; zinc finger protein 212


U39840_at 


Hs.163484 


NM_004496; forkhead box A1 


U41315_rna1_s_at






U44111_at 


Hs.42151 


NM_006895; histamine N-methyltransferase 


U47414_at 


Hs.13291


NM_004354; cyclin G2 


U49352_at 


Hs.414754 


NM_001359; 2,4-dienoyl CoA reductase 1 precursor


U50708_at 


Hs.1265


NM_000056; branched chain keto acid dehydrogenase E1, beta 
polypeptide precursor NM_183050; branched chain keto acid
dehydrogenase E1, beta polypeptide precursor
U52101_at


Hs.9999 


NM_001425; epithelial membrane protein 3 


U59914_at 


Hs.153863 


NM_005585; MAD, mothers against decapentaplegic homolog 6


U60205_at 


Hs.393239 


NM_006745; sterol-C4-methyl oxidase-like


U61981_at 


Hs.42674 


NM_002439; mutS homolog 3


U64520_at 


Hs.66708 


NM_004781; vesicle-associated membrane protein 3 (cellubrevin) 


U65093_at 


Hs.82071 


NM_006079; Cbp/p300-interacting transactivator, with Glu/Asp-
rich carboxy-terminal domain, 2


U66619_at 


Hs.444445 


NM_003078; SWI/SNF-related matrix-associated actin-dependent 
regulator of chromatin d3 


U68019_at 


Hs.288261 


NM_005902; MAD, mothers against decapentaplegic homolog 3


U68385_at 


Hs.380923 




U68485_at 


Hs.193163 


NM_004305; bridging integrator 1 isoform 8 NM_139343; bridging 
integrator 1 isoform 1 NM_139344; bridging integrator 1 isoform 2
NM_139345; bridging integrator 1 isoform 3 NM_139346; bridging
integrator 1 isoform 4 NM_139347; bridging integrator 1 isoform 5
NM_139348; bridging integrator 1 isoform 6 NM_139349; bridging
integrator 1 isoform 7 NM_139350; bridging integrator 1 isoform 9
NM_139351; bridging integrator 1 isoform 10


U74324_at


Hs.90875 


NM_002871; RAB-interacting factor


U77970_at 


Hs.321164 


NM_002518; neuronal PAS domain protein 2 NM_032235;


U83303_cds2_at 


Hs.164021


NM_002993; chemokine (C-X-C motif) ligand 6 (granulocyte
chemotactic protein 2) 


U88871_at 


Hs.79993 


NM_000288; peroxisomal biogenesis factor 7 


U90549_at 


Hs.236774 


NM_006353; high mobility group nucleosomal binding domain 4 


U90716_at 


Hs.79187 


NM_001338; coxsackie virus and adenovirus receptor 


V00594_at 


Hs.118786


NM_005953; metallothionein 2A


V00594_s_at 


Hs.118786


NM_005953; metallothionein 2A


X02761_s_at 


Hs.418138 


NM_002026; fibronectin 1 isoform 1 preproprotein NM_054034; 
fibronectin 1 isoform 2 preproprotein


X04011_at


Hs.88974 


NM_000397; cytochrome b-245, beta polypeptide (chronic granu- 
lomatous disease) 


X04085_rna1_at






X07438_s_at 






X07743_at 


Hs.77436 


NM_002664; pleckstrin 


X13334_at 


Hs.75627 


NM_000591; CD14 antigen precursor 


X14046_at 


Hs.153053


NM_001774; CD37 antigen 


X14813_at 


Hs.166160 


NM_001607; acetyl-Coenzyme A acyltransferase 1 


X15880_at 


Hs.415997 


NM_001848; collagen, type VI, alpha 1 precursor 


X15882_at


Hs.420269 


NM_001849; alpha 2 type VI collagen isoform 2C2 precursor
NM_058174; alpha 2 type VI collagen isoform 2C2a precursor
NM_058175; alpha 2 type VI collagen isoform 2C2a precursor


X51408_at 


Hs.380138 


NM_001822; chimerin (chimaerin) 1


X53800_s_at 


Hs.89690 


NM_002090; chemokine (C-X-C motif) ligand 3


X54489_rna1_at






X57351_s_at 


Hs.174195 


NM_006435; interferon induced transmembrane protein 2 (1-8D)


X57579_s_at 






X58072_at 


Hs.169946


NM_002051; GATA binding protein 3 NM_032742;


X62048_at


Hs.249441 


NM_003390; wee1 tyrosine kinase


X64072_s_at 


Hs.375957 


NM_000211; integrin beta chain, beta 2 precursor


X65614_at 


Hs.2962 


NM_005980; S100 calcium binding protein P


X66945_at 


Hs.748 


NM_000604; fibroblast growth factor receptor 1 isoform 1 precur-
sor NM_015850; fibroblast growth factor receptor 1 isoform 2
precursor NM_023105; fibroblast growth factor receptor 1 isoform
3 precursor NM_023106; fibroblast growth factor receptor 1 iso-
form 4 precursor NM_023107; fibroblast growth factor receptor 1
isoform 5 precursor NM_023108; fibroblast growth factor receptor
1 isoform 6 precursor NM_023109; fibroblast growth factor recep-
tor 1 isoform 7 precursor NM_023110; fibroblast growth factor
receptor 1 isoform 8 precursor NM_023111; fibroblast growth
factor receptor 1 isoform 9 precursor


X67491_f_at 


Hs.355697 


NM_005271; glutamate dehydrogenase 1


X68194_at 


Hs.80919 


NM_006754; synaptophysin-like protein isoform a NM_182715;
synaptophysin-like protein isoform b


X73882_at 


Hs.254605 


NM_003980; microtubule-associated protein 7 


X78520_at 


Hs.372528 


NM_001829; chloride channel 3 


X78549_at 


Hs.51133 


NM_005975; PTK6 protein tyrosine kinase 6 


X78565_at 


Hs.98998 


NM_002160; tenascin C (hexabrachion)


X78669_at 


Hs.79088 


NM_002902; reticulocalbin 2, EF-hand calcium binding domain


X83618_at 


Hs.59889 


NM_005518; 3-hydroxy-3-methylglutaryl-Coenzyme A synthase 2
(mitochondrial) 


X84908_at 


Hs.78060 


NM_000293; phosphorylase kinase, beta


X90908_at 


Hs.147391 


NM_001445; gastrotropin 


X91504_at 


Hs.389277 


NM_003224; ADP-ribosylation factor related protein 1 


X95632_s_at 


Hs.387906 


NM_005759; abl-interactor 2 


X97267_rna1_s_at 






Y00705_at 


Hs.407856 


NM_003122; serine protease inhibitor, Kazal type 1 


Y00787_s_at 


Hs.624 


NM_000584; interleukin 8 precursor 


Y00815_at 


Hs.75216 


NM_002840; protein tyrosine phosphatase, receptor type, F iso-
form 1 precursor NM_130440; protein tyrosine phosphatase,
receptor type, F isoform 2 precursor 


Y08374_rna1_at 






Z12173_at 


Hs.334534 


NM_002076; glucosamine (N-acetyl)-6-sulfatase precursor


Z19554_s_at 


Hs.435800 


NM_003380; vimentin


Z26491_s_at 


Hs.240013 


NM_000754; catechol-O-methyltransferase isoform MB-COMT
NM_007310; catechol-O-methyltransferase isoform S-COMT


Z29331_at 


Hs.372758 


NM_003344; ubiquitin-conjugating enzyme E2H isoform 1
NM_182697; ubiquitin-conjugating enzyme E2H isoform 2


Z35491_at 


Hs.377484 


NM_004323; BCL2-associated athanogene isoform 1L


Z48199_at 


Hs.82109 


NM_002997; syndecan 1 


Z48605_at 


Hs.421825 


NM_006903; inorganic pyrophosphatase 2 isoform 2 NM_176865;
NM_176866; inorganic pyrophosphatase 2 isoform 3 NM_176867;
inorganic pyrophosphatase 2 isoform 4 NM_176869; inorganic
pyrophosphatase 2 isoform 1


Z74615_at 


Hs.172928


NM_000088; alpha 1 type I collagen preproprotein 
AF000231_at


Hs.75618


NM_004663; Ras-related protein Rab-11A


D13666_s_at 


Hs.136348


NM_006475; osteoblast specific factor 2 (fasciclin I-like)


D49372_s_at


Hs.54460


NM_002986; small inducible cytokine A11 precursor




D83920_at


Hs.440898


NM_002003; ficolin 1 precursor


D86479_at


Hs.439463




NM_001129; adipocyte enhancer binding protein 1 precursor 




D87433_at


Hs.301989


NM_015136; stabilin 1


D89077_at


Hs.75367


NM_006748; Src-like-adaptor


D89377_at 


Hs.89404 


NM_002449; msh homeo box homolog 2 


HG4069-HT4339_s_at






HG67-HT67_f_at






HG907-HT907_at






J02871_s_at


Hs.436317


NM_000779; cytochrome P450, family 4, subfamily B, polypeptide 
1 


J03278_at 


Hs.307783 


NM_002609; platelet-derived growth factor receptor beta precursor


J04058_at 


Hs.169919 


NM_000126; electron transfer flavoprotein, alpha polypeptide 


J05032_at 


Hs.32393 


NM_001349; aspartyl-tRNA synthetase


J05070_at 


Hs.151738 


NM_004994; matrix metalloproteinase 9 preproprotein 


J05448_at 


Hs.79402 


NM_002694; DNA directed RNA polymerase II polypeptide C 
NM_032940; DNA directed RNA polymerase II polypeptide C 


K01396_at 


Hs.297681 


NM_000295; serine (or cysteine) proteinase inhibitor, clade A 
(alpha-1 antiproteinase, antitrypsin), member 1


L13720_at


Hs.437710 


NM_000820; growth arrest-specific 6


L40904_at


Hs.387667 


NM_005037; peroxisome proliferative activated receptor gamma 
isoform 1 NM_015869; peroxisome proliferative activated receptor
gamma isoform 2 NM_138711; peroxisome proliferative activated
receptor gamma isoform 1 NM_138712; peroxisome proliferative
activated receptor gamma isoform 1 


M12125_at


Hs.300772


NM_003289; tropomyosin 2 (beta) 


M15395_at


Hs.375957


NM_000211; integrin beta chain, beta 2 precursor 


M16591_s_at


Hs.89555


NM_002110; hemopoietic cell kinase isoform p61HCK


M20530_at






M23178_s_at 


Hs.73817 


NM_002983; chemokine (C-C motif) ligand 3


M32011_at 


Hs.949 


NM_000433; neutrophil cytosolic factor 2 


M33195_at 


Hs.433300 


NM_004106; Fc fragment of IgE, high affinity I, receptor for,
gamma polypeptide precursor 


M55998_s_at 


Hs.172928


NM_000088; alpha 1 type I collagen preproprotein 


M57731_s_at 


Hs.75765 


NM_002089; chemokine (C-X-C motif) ligand 2 


M63262_at 






M68840_at 


Hs.183109 


NM_000240; monoamine oxidase A


M69203_s_at


Hs.75703 


NM_002984; chemokine (C-C motif) ligand 4 precursor


M72885_rna1_s_at






M83822_at 


Hs.209846 


NM_006726; LPS-responsive vesicle trafficking, beach and anchor
containing


S77393_at 


Hs.145754 


NM_016531; Kruppel-like factor 3 (basic)



U01833_at 


Hs.81469 


NM_002484; nucleotide binding protein 1 (MinD homolog, E. coli)



U07231_at 



Hs.309763 



NM_002092; G-rich RNA sequence binding factor 1 



U09937_rna1_s_at 






U10550_at 


Hs.79022 


NM_005261; GTP-binding mitogen-induced T-cell protein 


NM_181702; GTP-binding mitogen-induced T-cell protein 


U20158_at 


Hs.2488 


NM_005565; lymphocyte cytosolic protein 2


U28488_s_at 


Hs.155935


NM_004054; complement component 3a receptor 1 


U29680_at 


Hs.227817 


NM_004049; BCL2-related protein A1 


U41315_rna1_s_at






U47414_at 


Hs.13291 


NM_004354; cyclin G2 


U49352_at 


Hs.414754 


NM_001359; 2,4-dienoyl CoA reductase 1 precursor


U50708_at 


Hs.1265 


NM_000056; branched chain keto acid dehydrogenase E1, beta 
polypeptide precursor NM_183050; branched chain keto acid
dehydrogenase E1, beta polypeptide precursor 


U52101_at


Hs.9999 


NM_001425; epithelial membrane protein 3 


U59914_at 


Hs.153863


NM_005585; MAD, mothers against decapentaplegic homolog 6 


U64520_at


Hs.66708 


NM_004781; vesicle-associated membrane protein 3 (cellubrevin)


U65093_at 


Hs.82071 


NM_006079; Cbp/p300-interacting transactivator, with Glu/Asp-
rich carboxy-terminal domain, 2


U68019_at


Hs.288261 



NM_005902; MAD, mothers against decapentaplegic homolog 3 


U68385_at


Hs.380923




U74324_at


Hs.90875 


NM_002871; RAB-interacting factor


U77970_at


Hs.321164 


NM_002518; neuronal PAS domain protein 2 NM_032235;


U90549_at


Hs.236774 


NM_006353; high mobility group nucleosomal binding domain 4 


X04085_rna1_at






X07438_s_at 






X07743_at 


Hs.77436 


NM_002664; pleckstrin


X13334_at 


Hs.75627 


NM_000591; CD14 antigen precursor 


X14046_at 


Hs.153053


NM_001774; CD37 antigen


X15880_at 


Hs.415997 


NM_001848; collagen, type VI, alpha 1 precursor 


X15882_at 


Hs.420269 


NM_001849; alpha 2 type VI collagen isoform 2C2 precursor
NM_058174; alpha 2 type VI collagen isoform 2C2a precursor
NM_058175; alpha 2 type VI collagen isoform 2C2a precursor 


X51408_at


Hs.380138 


NM_001822; chimerin (chimaerin) 1


X53800_s_at 


Hs.89690 


NM_002090; chemokine (C-X-C motif) ligand 3


X54489_rna1_at






X57579_s_at






X62048_at


Hs.249441 


NM_003390; wee1 tyrosine kinase


X64072_s_at


Hs.375957


NM_000211; integrin beta chain, beta 2 precursor


X67491_f_at 


Hs.355697 


NM_005271; glutamate dehydrogenase 1


X68194_at 


Hs.80919 


NM_006754; synaptophysin-like protein isoform a NM_182715;
synaptophysin-like protein isoform b


X73882_at 


Hs.254605 


NM_003980; microtubule-associated protein 7 


X78520_at 


Hs.372528 


NM_001829; chloride channel 3


X97267_rna1_s_at


Y00787_s_at


Hs.624 


NM_000584; interleukin 8 precursor


Z12173_at 


Hs.334534 


NM_002076; glucosamine (N-acetyl)-6-sulfatase precursor 


Z19554_s_at 


Hs.435800 


NM_003380; vimentin 


Z26491_s_at 


Hs.240013 


NM_000754; catechol-O-methyltransferase isoform MB-COMT
NM_007310; catechol-O-methyltransferase isoform S-COMT


Z29331_at 


Hs.372758 


NM_003344; ubiquitin-conjugating enzyme E2H isoform 1 
NM_182697; ubiquitin-conjugating enzyme E2H isoform 2


Z48605_at 


Hs.421825 


NM_006903; inorganic pyrophosphatase 2 isoform 2 NM_176865;
NM_176866; inorganic pyrophosphatase 2 isoform 3 NM_176867;
inorganic pyrophosphatase 2 isoform 4 NM_176869; inorganic
pyrophosphatase 2 isoform 1


Z74615_at 


Hs.172928 


NM_000088; alpha 1 type I collagen preproprotein


Table 12. 40 genes for classifier 


Chip acc. # 


UniGene Build 162


description 


D83920_at 


Hs.440898 


NM_002003; ficolin 1 precursor


D89377_at 


Hs.89404 


NM_002449; msh homeo box homolog 2


J02871_s_at 


Hs.436317


NM_000779; cytochrome P450, family 4, subfamily B, polypeptide 
1 


J05032_at 


Hs.32393 


NM_001349; aspartyl-tRNA synthetase


J05070_at 


Hs.151738 


NM_004994; matrix metalloproteinase 9 preproprotein


M16591_s_at 


Hs.89555 


NM_002110; hemopoietic cell kinase isoform p61HCK 


M23178_s_at 


Hs.73817 


NM_002983; chemokine (C-C motif) ligand 3


M32011_at 


Hs.949 


NM_000433; neutrophil cytosolic factor 2


M33195_at 


Hs.433300 


NM_004106; Fc fragment of IgE, high affinity I, receptor for,
gamma polypeptide precursor 


M57731_s_at 


Hs.75765 


NM_002089; chemokine (C-X-C motif) ligand 2


M68840_at 


Hs.183109 


NM_000240; monoamine oxidase A 


M69203_s_at 


Hs.75703 


NM_002984; chemokine (C-C motif) ligand 4 precursor


S77393_at 


Hs.145754 


NM_016531; Kruppel-like factor 3 (basic)


U01833_at 


Hs.81469 


NM_002484; nucleotide binding protein 1 (MinD homolog, E. coli) 


U07231_at 


Hs.309763 


NM_002092; G-rich RNA sequence binding factor 1 


U09937_rna1_s_at 






U20158_at 


Hs.2488 


NM_005565; lymphocyte cytosolic protein 2


U41315_rna1_s_at






U47414_at 


Hs.13291 


NM_004354; cyclin G2


U49352_at 


Hs.414754 


NM_001359; 2,4-dienoyl CoA reductase 1 precursor


U50708_at 


Hs.1265 


NM_000056; branched chain keto acid dehydrogenase E1, beta
polypeptide precursor NM_183050; branched chain keto acid
dehydrogenase E1, beta polypeptide precursor 


U65093_at 


Hs.82071 


NM_006079; Cbp/p300-interacting transactivator, with Glu/Asp-
rich carboxy-terminal domain, 2


U68385_at 


Hs.380923 




U77970_at 


Hs.321164 


NM_002518; neuronal PAS domain protein 2 NM_032235; 


U90549_at 


Hs.236774 


NM_006353; high mobility group nucleosomal binding domain 4 


X13334_at 


Hs.75627 


NM_000591; CD14 antigen precursor 
X15880_at


Hs.415997


NM_001848; collagen, type VI, alpha 1 precursor


X15882_at


Hs.420269


NM_001849; alpha 2 type VI collagen isoform 2C2 precursor
NM_058174; alpha 2 type VI collagen isoform 2C2a precursor
NM_058175; alpha 2 type VI collagen isoform 2C2a precursor


X51408_at 


Hs.380138 


NM_001822; chimerin (chimaerin) 1




X53800_s_at


Hs.89690


NM_002090; chemokine (C-X-C motif) ligand 3


X54489_rna1_at






X57579_s_at






X64072_s_at 


Hs.375957 


NM_000211; integrin beta chain, beta 2 precursor


X67491_f_at


Hs.355697


NM_005271; glutamate dehydrogenase 1


X68194_at 


Hs.80919 


NM_006754; synaptophysin-like protein isoform a NM_182715; 
synaptophysin-like protein isoform b


X73882_at 


Hs.254605 


NM_003980; microtubule-associated protein 7




X78520_at


Hs.372528


NM_001829; chloride channel 3




Z29331_at


Hs.372758


NM_003344; ubiquitin-conjugating enzyme E2H isoform 1
NM_182697; ubiquitin-conjugating enzyme E2H isoform 2


Z48605_at 


Hs.421825


NM_006903; inorganic pyrophosphatase 2 isoform 2 NM_176865; 
NM_176866; inorganic pyrophosphatase 2 isoform 3 NM_176867;
inorganic pyrophosphatase 2 isoform 4 NM_176869; inorganic 
pyrophosphatase 2 isoform 1 


Z74615_at 


Hs.172928 


NM_000088; alpha 1 type I collagen preproprotein 


Table 13. 20 genes for classifier 


Chip acc. # 


UniGene Build 162 


description


D89377_at 


Hs.89404 


NM_002449; msh homeo box homolog 2 


J05032_at 


Hs.32393 


NM_001349; aspartyl-tRNA synthetase


M23178_s_at 


Hs.73817 


NM_002983; chemokine (C-C motif) ligand 3 


M32011_at 


Hs.949 


NM_000433; neutrophil cytosolic factor 2


M69203_s_at 


Hs.75703 


NM_002984; chemokine (C-C motif) ligand 4 precursor 


S77393_at 


Hs.145754 


NM_016531; Kruppel-like factor 3 (basic)


U07231_at 


Hs.309763 


NM_002092; G-rich RNA sequence binding factor 1 


U41315_rna1_s_at 






U47414_at 


Hs.13291


NM_004354; cyclin G2


U49352_at 


Hs.414754 


NM_001359; 2,4-dienoyl CoA reductase 1 precursor 


U50708_at 


Hs.1265 


NM_000056; branched chain keto acid dehydrogenase E1, beta
polypeptide precursor NM_183050; branched chain keto acid
dehydrogenase E1, beta polypeptide precursor 


U77970_at 


Hs.321164 


NM_002518; neuronal PAS domain protein 2 NM_032235; 


X13334_at 


Hs.75627 


NM_000591; CD14 antigen precursor 


X57579_s_at 






X64072_s_at 


Hs.375957 


NM_000211; integrin beta chain, beta 2 precursor


X68194_at 


Hs.80919 


NM_006754; synaptophysin-like protein isoform a NM_182715;
synaptophysin-like protein isoform b 


X73882_at 


Hs.254605 


NM_003980; microtubule-associated protein 7


X78520_at 


Hs.372528 


NM_001829; chloride channel 3


Z48605_at 


Hs.421825 


NM_006903; inorganic pyrophosphatase 2 isoform 2 NM_176865;
NM_176866; inorganic pyrophosphatase 2 isoform 3 NM_176867;
inorganic pyrophosphatase 2 isoform 4 NM_176869; inorganic
pyrophosphatase 2 isoform 1


Z74615_at 


Hs.172928


NM_000088; alpha 1 type I collagen preproprotein 


Table 14. 10 genes for classifier 


Chip acc. # 


UniGene Build 162 


description 


D89377_at 


Hs.89404 


NM_002449; msh homeo box homolog 2 


S77393_at 


Hs.145754 


NM_016531; Kruppel-like factor 3 (basic)


U41315_rna1_s_at 






U47414_at 


Hs.13291 


NM_004354; cyclin G2


U77970_at 


Hs.321164 


NM_002518; neuronal PAS domain protein 2 NM_032235;


X68194_at 


Hs.80919 


NM_006754; synaptophysin-like protein isoform a NM_182715;
synaptophysin-like protein isoform b 


X73882_at 


Hs.254605 


NM_003980; microtubule-associated protein 7 


X78520_at 


Hs.372528 


NM_001829; chloride channel 3


Z48605_at 


Hs.421825 


NM_006903; inorganic pyrophosphatase 2 isoform 2 NM_176865;
NM_176866; inorganic pyrophosphatase 2 isoform 3 NM_176867;
inorganic pyrophosphatase 2 isoform 4 NM_176869; inorganic
pyrophosphatase 2 isoform 1


Z74615_at 


Hs.172928


NM_000088; alpha 1 type I collagen preproprotein 


Table 15. 32 genes for classifier


Chip acc. # 


UniGene Build 162 


description 


D83920_at 


Hs.440898 


NM_002003; ficolin 1 precursor 


HG67-HT67_f_at






HG907-HT907_at 






J05032_at


Hs.32393 


NM_001349; aspartyl-tRNA synthetase 


K01396_at 


Hs.297681 


NM_000295; serine (or cysteine) proteinase inhibitor, clade A
(alpha-1 antiproteinase, antitrypsin), member 1 


M16591_s_at 


Hs.89555 


NM_002110; hemopoietic cell kinase isoform p61HCK 


M32011_at 


Hs.949 


NM_000433; neutrophil cytosolic factor 2 


M33195_at 


Hs.433300 


NM_004106; Fc fragment of IgE, high affinity I, receptor for, 
gamma polypeptide precursor 


M37033_at 


Hs.443057 


NM_000560; CD53 antigen 


M57731_s_at 


Hs.75765 


NM_002089; chemokine (C-X-C motif) ligand 2 


M63262_at 






S77393_at 


Hs.145754 


NM_016531; Kruppel-like factor 3 (basic)


U01833_at 


Hs.81469 


NM_002484; nucleotide binding protein 1 (MinD homolog, E. coli) 


U07231_at 


Hs.309763 


NM_002092; G-rich RNA sequence binding factor 1


U41315_rna1_s_at






U47414_at 


Hs.13291 


NM_004354; cyclin G2 


U50708_at 


Hs.1265 


NM_000056; branched chain keto acid dehydrogenase E1, beta 
polypeptide precursor NM_183050; branched chain keto acid
dehydrogenase E1, beta polypeptide precursor 


U52101_at 


Hs.9999 


NM_001425; epithelial membrane protein 3 
U74324_at


Hs.90875


NM_002871; RAB-interacting factor


U77970_at 


Hs.321164 


NM_002518; neuronal PAS domain protein 2 NM_032235;


U90549_at


Hs.236774


NM_006353; high mobility group nucleosomal binding domain 4 


X13334_at


Hs.75627 


NM_000591; CD14 antigen precursor 


X54489_rna1_at






X57579_s_at






X64072_s_at 


Hs.375957 


NM_000211; integrin beta chain, beta 2 precursor


X68194_at


Hs.80919 


NM_006754; synaptophysin-like protein isoform a NM_182715;
synaptophysin-like protein isoform b


X73882_at 


Hs.254605 


NM_003980; microtubule-associated protein 7 


X78520_at 


Hs.372528 


NM_001829; chloride channel 3


X95632_s_at 


Hs.387906 


NM_005759; abl-interactor 2




Z29331_at


Hs.372758


NM_003344; ubiquitin-conjugating enzyme E2H isoform 1
NM_182697; ubiquitin-conjugating enzyme E2H isoform 2


Z48605_at 


Hs.421825 


NM_006903; inorganic pyrophosphatase 2 isoform 2 NM_176865;
NM_176866; inorganic pyrophosphatase 2 isoform 3 NM_176867;
inorganic pyrophosphatase 2 isoform 4 NM_176869; inorganic
pyrophosphatase 2 isoform 1


Z74615_at 


Hs.172928


NM_000088; alpha 1 type I collagen preproprotein 



Recurrence predictor 

We furthermore tested an outcome predictor able to identify the likely presence or absence 
of recurrence in patients with superficial Ta tumours (see Table 16). 


Table 16. Patient disease course information - recurrence vs. no recurrence 
From the hierarchical cluster analysis of the tumour samples we found that the tumours with
a high recurrence frequency were separated from the tumours with low recurrence
frequency. To study this further we profiled two groups of Ta tumours: 15 tumours with low
recurrence frequency and 16 tumours with high recurrence frequency. To avoid influence
from other tumour characteristics we only used tumours that showed the same growth
pattern and tumours that showed no sign of concomitant carcinoma in situ. Furthermore, the
tumours were all primary tumours. The tumours used for identifying genes differentially
expressed in recurrent and non-recurrent tumours are listed in Table 16 below.


Table 16. Disease course information of all patients involved.

Group  Patient  Tumour (date)     Pattern    Carcinoma in situ  Time to recurrence
A      968-1    Ta gr2            Papillary  no                 27 months
A      928-1    Ta gr2            Papillary  no                 38 months
A      934-1    Ta gr2 (220798)   Papillary  no
A      709-1    Ta gr2 (210798)   Papillary  no
A      930-1    Ta gr2 (300698)   Papillary  no
A      524-1    Ta gr2 (201095)   Papillary  no
A      455-1    Ta gr2 (060695)   Papillary  no
A      370-1    Ta gr2 (100195)   Papillary  no
A      810-1    Ta gr2 (031097)   Papillary  no
A      1146-1   Ta gr2 (231199)   Papillary  no
A      1161-1   Ta gr2 (101299)   Mixed      no
A      1006-1   Ta gr2 (231198)   Papillary  no
A      942-1    Ta gr2            Papillary  no                 [illegible]
A      1060-1   Ta gr2            Papillary  no                 [illegible]
A      1255-1   Ta gr2            Papillary  no                 [illegible]
B      441-1    Ta gr2            Papillary  no                 [illegible]
B      780-1    Ta gr2            Papillary  no                 [illegible]
B      815-2    Ta gr2            Papillary  no                 [illegible]
B      829-1    Ta gr2            Papillary  no                 [illegible]
B      861-1    Ta gr2            Papillary  no                 [illegible]
B      925-1    Ta gr2            Papillary  no                 [illegible]
B      1008-1   Ta gr2            Papillary  no                 [illegible]
B      1086-1   Ta gr2            Papillary  no                 [illegible]
B      1105-1   Ta gr2            Papillary  no                 [illegible]
B      1145-1   Ta gr2            Papillary  no                 4 months
B      1327-1   Ta gr2            Papillary  no                 5 months
B      1352-1   Ta gr2            Papillary  no                 6 months
B      1379-1   Ta gr2            Papillary  no                 5 months
B      533-1    Ta gr2            Papillary  no                 4 months
B      679-1    Ta gr2            Papillary  no                 4 months
B      692-1    Ta gr2            Papillary  no                 5 months


Group A: Primary tumours from patients with no recurrence of the disease for 2 years. 
Group B: Primary tumours from patients with recurrence of the disease within 8 months. 
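
The two group definitions above can be written as a small rule. The following is a minimal sketch only: the function name, the argument names and the handling of boundary cases (recurrence at exactly 8 or 24 months, patients with short follow-up, patients recurring between 8 and 24 months) are assumptions for illustration and are not specified in the text.

```python
from typing import Optional


def assign_group(months_to_recurrence: Optional[float],
                 follow_up_months: float) -> Optional[str]:
    """Return 'A', 'B', or None for a primary Ta tumour.

    Group A: no recurrence of the disease for 2 years (24 months).
    Group B: recurrence of the disease within 8 months.
    None:    the patient fits neither definition (assumed to be excluded).
    """
    if months_to_recurrence is None:
        # No recurrence observed: group A only if followed for at least 2 years.
        return "A" if follow_up_months >= 24 else None
    if months_to_recurrence <= 8:
        return "B"
    if months_to_recurrence >= 24:
        return "A"
    return None  # recurrence between 8 and 24 months: neither group


if __name__ == "__main__":
    print(assign_group(27, 40))    # recurrence after 27 months
    print(assign_group(4, 40))     # recurrence after 4 months
    print(assign_group(None, 30))  # no recurrence during 30 months of follow-up
```

Under these assumptions a tumour with a time to recurrence of 27 months falls in group A and one with 4 months falls in group B, in line with the entries of Table 16.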

Supervised learning prediction of recurrence

In this part of the work we identified genes differentially expressed between non-recurring 
and recurring tumours. Cross-validation and prediction was performed as previously de- 
scribed, except that genes are selected based on the value of the Wilcoxon statistic for dif- 
ference between the two groups. 
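
The gene selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the Wilcoxon statistic is the rank sum of one group, that genes are ranked by how far this rank sum lies from its expectation under no difference, and that `expression` maps hypothetical gene names to per-sample values.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_rank_sum(values, labels, group=1):
    """Sum of the ranks of the samples that belong to `group`."""
    ranks = average_ranks(values)
    return sum(r for r, g in zip(ranks, labels) if g == group)


def select_genes(expression, labels, n_genes, group=1):
    """Pick the n_genes whose rank sum deviates most from its expected value."""
    n_group = sum(1 for g in labels if g == group)
    expected = n_group * (len(labels) + 1) / 2

    def deviation(name):
        return abs(wilcoxon_rank_sum(expression[name], labels, group) - expected)

    return sorted(expression, key=deviation, reverse=True)[:n_genes]
```

In a leave-one-out setting, `select_genes` would be called once per loop on the training samples only, so that the left-out sample never influences which genes are chosen.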


Prediction performance 

The prediction performance was tested using from 1 to 200 genes in the cross-validation loops.
Figure 11 shows that the lowest error rate (8 errors) is obtained with, e.g., the
cross-validation model using 39 genes. Based on this we selected this cross-validation model
as our final predictor. The results of the predictions from the 39-gene cross-validation loops
are listed in Table 17. The predictor misclassified four of the samples in each group, and in
one of the predictions the difference in the distances between the two group means is below
the 5% difference limit as described above.
The probability of misclassifying 8 or fewer arrays by a random classification is 0.0053.
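
The quoted probability can be reproduced with a binomial tail, under the assumption (not stated explicitly in the text) that a random classification misclassifies each of the 31 arrays in Table 17 independently with probability 0.5. The function name is illustrative only.

```python
from math import comb


def prob_at_most_k_errors(n_samples, k_errors, p_error=0.5):
    """P(number of errors <= k_errors) for a random classifier (binomial tail)."""
    return sum(
        comb(n_samples, k) * p_error**k * (1 - p_error) ** (n_samples - k)
        for k in range(k_errors + 1)
    )


p = prob_at_most_k_errors(31, 8)
print(round(p, 4))  # 0.0053
```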

Table 17. Recurrence prediction results of 39-gene cross-validation loops.
Group A: Primary tumours from patients with no recurrence of the disease for 2 years.
Group B: Primary tumours from patients with recurrence of the disease within 8 months.
Prediction: 0 = no recurrence, 1 = recurrence.



Group | Patient | Tumour (date) | Prediction | Error | Prediction strength
A | 968-1 | Ta gr2 | 0 | | 0.19
A | 928-1 | Ta gr2 | 0 | | 0.49
A | 934-1 | Ta gr2 (220798) | 0 | | 1.73
A | 709-1 | Ta gr2 (210798) | 0 | | 0.45
A | 930-1 | Ta gr2 (300698) | 0 | | 0.82
A | 524-1 | Ta gr2 (201095) | 0 | | 0.14
A | 455-1 | Ta gr2 (060695) | 1 | * | 0.68
A | 370-1 | Ta gr2 (100195) | 0 | | 0.32
A | 810-1 | Ta gr2 (031097) | 0 | | 0.45
A | 1146-1 | Ta gr2 (231199) | 0 | | 0.98
A | 1161-1 | Ta gr2 (101299) | 0 | | 0.03
A | 1006-1 | Ta gr2 (231198) | 1 | * | 1.57
A | 942-1 | Ta gr2 | 0 | | 0.31
A | 1060-1 | Ta gr2 | 1 | * | 0.81
A | 1255-1 | Ta gr2 | 1 | * | 0.71
B | 441-1 | Ta gr2 | 1 | | 1.03
B | 780-1 | Ta gr2 | 1 | | 0.37
B | 815-2 | Ta gr2 | 1 | | 0.35
B | 829-1 | Ta gr2 | 1 | | 0.75
B | 861-1 | Ta gr2 | 0 | * | 2.55
B | 925-1 | Ta gr2 | 1 | | 0.78
B | 1008-1 | Ta gr2 | 0 | * | 0.12
B | 1086-1 | Ta gr2 | 0 | * | 0.51
B | 1105-1 | Ta gr2 | 1 | | 0.37
B | 1145-1 | Ta gr2 | 1 | | 0.44
B | 1327-1 | Ta gr2 | 1 | | 1.96
B | 1352-1 | Ta gr2 | 0 | * | 0.97
B | 1379-1 | Ta gr2 | 1 | | 0.67
B | 533-1 | Ta gr2 | 1 | | 0.31
B | 679-1 | Ta gr2 | 1 | | 0.82
B | 692-1 | Ta gr2 | 1 | | 0.45



The optimal number of genes in cross-validation loops was found to be 39 (75% of the samples
were correctly classified, p<0.006) and from this we selected those 26 genes that were
used in at least 75% of the cross-validation loops to constitute our final recurrence predictor.
Consequently, this set of genes is to be used for predicting recurrence in independent
samples. We tested the strength of the predictive genes by permutation analysis, see Table 18.
We selected the genes used in at least 29 of the 31 cross-validation loops to constitute our
final recurrence prediction model. The expression pattern of those 26 genes is shown in Fig. 12.
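
Selecting the genes that recur across cross-validation loops can be sketched as below. The sketch assumes each loop yields a list of selected gene identifiers; the function name and the toy identifiers are hypothetical.

```python
from collections import Counter


def stable_genes(loop_gene_sets, min_loops):
    """Return genes selected in at least `min_loops` cross-validation loops."""
    counts = Counter(g for genes in loop_gene_sets for g in set(genes))
    return sorted(g for g, c in counts.items() if c >= min_loops)


# Toy usage with three loops; in the text the threshold is 29 of 31 loops.
loops = [["g1", "g2"], ["g1", "g3"], ["g1", "g2"]]
print(stable_genes(loops, 2))  # ['g1', 'g2']
```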



Table 18. The 26 genes that we find optimal for recurrence prediction. 





Probe set | Unigene build 468 | Description | Number | Test
AF006041_at | Hs.336916 | NM_001350; death-associated protein 6 | 31 | 0.054 (161-7)
D21337_at | Hs.408 | NM_001847; type IV alpha 6 collagen isoform A precursor; NM_033641; type IV alpha 6 collagen isoform B precursor | 31 | 0.058 (160-6)
D49387_at | Hs.294584 | NM_012212; NADP-dependent leukotriene B4 12-hydroxydehydrogenase | 31 | 0.118 (313-8)
D64154_at | Hs.90107 | NM_007002; adhesion regulating molecule 1 precursor; NM_175573; adhesion regulating molecule 1 precursor | 31 | 0.078 (165-9)
D83780_at | Hs.437991 | NM_014846; KIAA0196 gene product | 31 | 0.094 (159-4)
D87258_at | Hs.75111 | NM_002775; protease, serine, 11 | 30 | 0.112 (168-11)
D87437_at | Hs.43660 | NM_014837; chromosome 1 open reading frame 16 | 31 | 0.058 (160-6)
HG1879-HT1919_at | | | 31 | 0.122 (314-7)
HG3076-HT3238_s_at | | | 31 | 0.080 (309-17)
HG511-HT511_at | | | 31 | 0.348 (319-2)
L34155_at | Hs.83450 | NM_000227; laminin alpha 3 subunit precursor | 31 | 0.122 (314-7)
L38928_at | Hs.118131 | NM_006441; 5,10-methenyltetrahydrofolate synthetase (5-formyltetrahydrofolate cyclo-ligase) | 29 | 0.348 (319-2)
L49169_at | Hs.75678 | NM_006732; FBJ murine osteosarcoma viral oncogene homolog B | 31 | 0.108 (155-2)
M16938_s_at | Hs.820 | NM_004503; homeo box C6 isoform 1; NM_153693; homeo box C6 isoform 2 | 29 | 0.09 (170-16)
M63175_at | Hs.295137 | NM_001144; autocrine motility factor receptor isoform a; NM_138958; autocrine motility factor receptor isoform b | 29 | 0.098 (308-18)
M64572_at | Hs.405666 | NM_002829; protein tyrosine phosphatase, non-receptor type 3 | 31 | 0.064 (305-31)
M98528_at | Hs.79404 | NM_014392; DNA segment on chromosome 4 (unique) 234 expressed sequence | 31 | 0.122 (314-7)
U21858_at | Hs.60679 | NM_003187; TBP-associated factor 9; NM_016283; adrenal gland protein AD-004 | 31 | 0.122 (314-7)
U45973_at | Hs.178347 | NM_016532; skeletal muscle and kidney enriched inositol phosphatase isoform 1; NM_130766; skeletal muscle and kidney enriched inositol phosphatase isoform 2 | 31 | 0.094 (310-14)
U58516_at | Hs.3745 | NM_005928; milk fat globule-EGF factor 8 protein | 29 | 0.100 (175-28)
U62015_at | Hs.8867 | NM_001554; cysteine-rich, angiogenic inducer, 61 | 31 | 0.106 (169-13)
U66702_at | Hs.74624 | NM_002847; protein tyrosine phosphatase, receptor type, N polypeptide 2 isoform 1 precursor; NM_130842; protein tyrosine phosphatase, receptor type, N polypeptide 2 isoform 2 precursor; NM_130843; protein tyrosine phosphatase, receptor type, N polypeptide 2 isoform 3 precursor | 31 | 0.146 (149-1)
U70439_s_at | Hs.84264 | NM_006401; acidic (leucine-rich) nuclear phosphoprotein 32 family, member B | 30 | 0.08 (309-17)
U94855_at | Hs.381255 | NM_003754; eukaryotic translation initiation factor 3, subunit 5 epsilon, 47kDa | 30 | 0.092 (311-12)
X63469_at | Hs.77100 | NM_002095; general transcription factor IIE, polypeptide 2, beta 34kDa | 31 | 0.092 (311-12)
Z23064_at | Hs.380118 | NM_002139; RNA binding motif protein, X chromosome | 30 | 0.066 (307-24)



Number: Number of times the gene has been used in a cross-validation loop. Test: The
numbers in parentheses are the value W of the Wilcoxon test statistic for no difference
between the two groups together with the number N of genes for which the Wilcoxon test
statistic is greater than or equal to the value W. The test value is obtained from 500
permutations of the arrays. In each permutation we form new pseudogroups where both of
the pseudogroups have the same proportion of arrays from the two original groups. For each
permutation we count the number of genes for which the Wilcoxon test statistic based on the
pseudogroups is greater than or equal to W, and the test value is the proportion of the
permutations for which this number is greater than or equal to N. Thus the test value
measures the significance of the observed value W. Consequently, for most of our selected
genes we find predictive genes that are at least as good in only about 10% of the formed
pseudogroups.
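
The permutation procedure in this footnote can be sketched as follows. This is an illustration under stated assumptions: pseudogroups are formed by sending half of each original group to pseudogroup 1 (one way of keeping the proportions equal), the statistic is the plain rank sum of pseudogroup 1, and ties are ignored for brevity. All function names are hypothetical.

```python
import random


def rank_sum(values, labels, group=1):
    """Rank sum of the samples in `group` (assumes no tied values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    rank = {idx: r + 1 for r, idx in enumerate(order)}
    return sum(rank[i] for i, g in enumerate(labels) if g == group)


def balanced_pseudogroups(labels, rng):
    """Assign half of each original group to pseudogroup 1, the rest to 0."""
    pseudo = [0] * len(labels)
    for g in sorted(set(labels)):
        members = [i for i, lab in enumerate(labels) if lab == g]
        rng.shuffle(members)
        for i in members[: len(members) // 2]:
            pseudo[i] = 1
    return pseudo


def permutation_test_value(genes, labels, w, n, n_perm=500, seed=0):
    """Proportion of permutations with at least n genes reaching statistic w."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        pseudo = balanced_pseudogroups(labels, rng)
        count = sum(1 for values in genes if rank_sum(values, pseudo) >= w)
        if count >= n:
            hits += 1
    return hits / n_perm
```

Here `genes` is a list of per-gene expression vectors, `w` is the observed statistic for the gene of interest, and `n` is the observed number of genes with a statistic of at least `w`.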

We present data on expression patterns that classify the benign and muscle-invasive bladder
carcinomas. Furthermore, we can identify subgroups of bladder cancer such as Ta tumours
with surrounding CIS, Ta tumours with a high probability of progression as well as
recurrence, and T2 tumours with squamous metaplasia. As a novel finding, the matrix
remodelling gene cluster was specifically expressed in the tumours having the worst
prognosis, namely the T2 tumours and tumours surrounded by CIS. For some of these genes new
small molecule inhibitors already exist (Kerr et al. 2002), and thus they form drug targets. At
present it is not possible clinically to distinguish patients who will experience recurrence
from those who will not, but doing so would be of great benefit to both the patients and the
health system by reducing the number of unnecessary control examinations in bladder tumour
patients. To determine the optimal gene-set for separating non-recurrent and recurrent
tumours, we again applied a cross-validation scheme using from 1 to 200 genes. We determined
the optimal number of genes in cross-validation loops to be 39 (75% of the samples were
correctly classified, p<0.01, Figure 11) and from this we selected those 26 genes (Figure 12)
that were used in at least 75% of the cross-validation loops to constitute our final
recurrence predictor. Consequently, this set of genes is to be used for predicting recurrence
in independent samples. We tested the strength of the predictive genes by performing 500
permutations of the arrays. This revealed that for most of our predictive genes we would
obtain predictors at least as good as in the real groups in only a small number of the new
pseudo-groups.

Biological material

66 bladder tumour biopsies were sampled from patients following removal of the necessary
amount of tissue for routine pathology examination. The tumours were frozen immediately
after surgery and stored at -80°C in a guanidinium thiocyanate solution. All tumours were
graded according to Bergkvist et al. 1965 and re-evaluated by a single pathologist. As normal
urothelial reference samples we used a pool of biopsies (from 37 patients) as well as
three single bladder biopsies from patients with prostatic hyperplasia or urinary incontinence.
Informed consent was obtained in all cases and protocols were approved by the local
scientific ethical committee.

RNA purification and cRNA preparation

Total RNA was isolated from crude tumour biopsies using a Polytron homogeniser and the
RNAzol B RNA isolation method (WAK-Chemie Medical GmbH). 10 µg total RNA was used
as starting material for the cDNA preparation. The first and second strand cDNA synthesis was
performed using the Superscript Choice System (Life Technologies) according to the
manufacturer's instructions, except that an oligo-dT primer containing a T7 RNA polymerase
promoter site was used. Labelled cRNA was prepared using the BioArray High Yield RNA
Transcript Labelling Kit (Enzo). Biotin-labelled CTP and UTP (Enzo) were used in the reaction
together with unlabelled NTPs. Following the IVT reaction, the unincorporated nucleotides were
removed using RNeasy columns (Qiagen).

Array hybridisation and scanning

15 µg of cRNA was fragmented at 94°C for 35 min in a fragmentation buffer containing 40
mM Tris-acetate pH 8.1, 100 mM KOAc, 30 mM MgOAc. Prior to hybridisation, the fragmented
cRNA in a 6xSSPE-T hybridisation buffer (1 M NaCl, 10 mM Tris pH 7.6, 0.005%
Triton) was heated to 95°C for 5 min and subsequently to 45°C for 5 min before loading onto
the Affymetrix probe array cartridge (HuGeneFL). The probe array was then incubated for 16
h at 45°C at constant rotation (60 rpm). The washing and staining procedure was performed
in the Affymetrix Fluidics Station. The probe array was exposed to 10 washes in 6xSSPE-T
at 25°C followed by 4 washes in 0.5xSSPE-T at 50°C. The biotinylated cRNA was stained
with a streptavidin-phycoerythrin conjugate, final concentration 2 µg/µl (Molecular Probes,
Eugene, OR) in 6xSSPE-T for 30 min at 25°C followed by 10 washes in 6xSSPE-T at 25°C.
The probe arrays were scanned at 560 nm using a confocal laser-scanning microscope
(Hewlett Packard GeneArray Scanner G2500A). The readings from the quantitative scanning
were analysed by the Affymetrix Gene Expression Analysis Software. An antibody
amplification step followed using normal goat IgG as blocking reagent, final concentration
0.1 mg/ml (Sigma) and biotinylated anti-streptavidin antibody (goat), final concentration
3 ng/ml (Vector Laboratories). This was followed by a staining step with a
streptavidin-phycoerythrin conjugate, final concentration 2 µg/µl (Molecular Probes, Eugene,
OR) in 6xSSPE-T for 30 min at 25°C and 10 washes in 6xSSPE-T at 25°C. The arrays were then
subjected to a second scan under conditions similar to those described above.

Class discovery using hierarchical clustering 

All microarray results were scaled to a global intensity of 150 units using the Affymetrix
GeneChip software. Other ways of array normalisation exist (Li and Hung 2001); however,
using the dCHIP approach did not change the expression profiles of the obtained classifier
genes in this study (results not shown). For hierarchical cluster analysis and molecular
classification procedures we used expression level ratios between tumours and the normal
urothelium reference pool calculated using the comparison analysis implemented in the
Affymetrix GeneChip software. In order to avoid expression ratios based on saturated
gene-probes, we used the antibody-amplified expression data for genes with a mean Average
Difference value across all samples below 1000 and the non-amplified expression data for
genes with a mean Average Difference value across all samples equal to or above 1000.
Consequently, the expression levels of a given gene across all samples were taken either
from the amplified or from the non-amplified expression data. We applied different filtering
criteria to the expression data in order to avoid including non-varying and very low expressed
genes in the data analysis. Firstly, we selected only genes that showed significant changes in
expression levels compared to the normal reference pool in at least three samples. Secondly,
only genes with at least three "Present" calls across all samples were selected. Thirdly, we
eliminated genes varying less than 2 standard deviations across all samples. The final
gene-set contained 1767 genes following filtering. Two-way hierarchical agglomerative cluster
analysis was performed using the Cluster software (25). We used average linkage clustering
with a modified Pearson correlation as similarity metric. Genes and arrays were median
centred and normalised to the magnitude of 1 prior to cluster analysis. The TreeView software
was used for visualisation of the cluster analysis results (Eisen et al. 1998).
Multidimensional scaling was performed on median centred and normalised data using an
implementation in the SPSS statistical software package.
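
The pre-clustering transformation and the similarity metric can be illustrated with a short sketch. It assumes that "modified Pearson correlation" refers to the uncentred correlation offered by the Cluster software; the text does not spell this out, so this reading and the function names below are assumptions.

```python
from math import sqrt
from statistics import median


def centre_and_normalise(row):
    """Median-centre a vector and scale it to magnitude (Euclidean norm) 1."""
    m = median(row)
    centred = [x - m for x in row]
    magnitude = sqrt(sum(x * x for x in centred))
    if magnitude == 0:
        return centred  # a constant vector cannot be scaled
    return [x / magnitude for x in centred]


def uncentred_correlation(a, b):
    """Pearson-like correlation computed without subtracting the means."""
    numerator = sum(x * y for x, y in zip(a, b))
    denominator = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return numerator / denominator if denominator else 0.0
```

Average linkage clustering would then merge, at each step, the two clusters with the highest mean pairwise similarity under this metric.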

Tumour stage classifier 

We based the classifier on the log-transformed expression level ratios. For these
transformed values we used a normal distribution with the mean dependent on the gene and the
group (Ta, T1, and T2, respectively) and the variance dependent on the gene only. For each
gene we calculated the variation within the groups (W) and the three variations between two
groups (B(Ta/T1), B(Ta/T2), B(T1/T2)) and used the three ratios B/W to select genes. We
selected those genes having a high value of B(Ta/T1)/W, those genes having a high value
of B(Ta/T2)/W, and those genes with a high value of B(T1/T2)/W. To classify a sample, we
calculated the sum over the genes of the squared distance from the sample value to the
group mean, standardised by the variance. Thus, we got a distance to each of the three
groups and the sample was classified as belonging to the group to which the distance was
smallest. When calculating these distances the group means and the variances were
estimated from all the samples in the training set excluding the sample being classified.
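
The distance rule above can be sketched as a small leave-one-out classifier. This is a minimal sketch, not the authors' code: it assumes the gene-specific variance is the pooled within-group variance, that every gene has non-zero variance, and that gene selection by B/W has already been done. Function names are hypothetical.

```python
def fit(train_values, train_labels):
    """Estimate per-group gene means and a pooled per-gene variance."""
    groups = sorted(set(train_labels))
    n_genes = len(train_values[0])
    means = {}
    for g in groups:
        members = [v for v, lab in zip(train_values, train_labels) if lab == g]
        means[g] = [sum(s[j] for s in members) / len(members) for j in range(n_genes)]
    dof = len(train_values) - len(groups)
    variances = []
    for j in range(n_genes):
        ss = sum((v[j] - means[lab][j]) ** 2 for v, lab in zip(train_values, train_labels))
        variances.append(ss / dof)
    return means, variances


def classify(sample, means, variances):
    """Assign the sample to the group with the smallest standardised distance."""
    distances = {
        g: sum((x - m) ** 2 / v for x, m, v in zip(sample, mu, variances))
        for g, mu in means.items()
    }
    return min(distances, key=distances.get), distances


def leave_one_out(values, labels):
    """Classify each sample with parameters estimated from all other samples."""
    predictions = []
    for i in range(len(values)):
        means, variances = fit(values[:i] + values[i + 1:], labels[:i] + labels[i + 1:])
        predictions.append(classify(values[i], means, variances)[0])
    return predictions
```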



Recurrence prediction using a supervised learning method 

Average Difference values were generated using the Affymetrix GeneChip software and all
values below 20 were set to 20 to avoid very low and negative numbers. We only included
genes that had a "Present" call in at least 7 samples and genes that showed intensity
variation (Max-Min>100, Max/Min>2). The values were log transformed and rescaled. We used a
supervised learning method essentially as described (Shipp et al. 2002). Genes were
selected using t-test statistics, and cross-validation and sample classification were
performed as described above.
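
The preprocessing steps can be sketched as follows. The sketch assumes a base-2 logarithm (the text does not state the base) and omits the rescaling step, whose details are not given; the data layout (dictionaries keyed by hypothetical gene names) is likewise an assumption.

```python
from math import log2


def preprocess(avg_diff, present_calls, floor=20.0, min_present=7,
               min_range=100.0, min_ratio=2.0):
    """Floor low values, filter by Present calls and variation, then log-transform."""
    kept = {}
    for gene, values in avg_diff.items():
        floored = [max(v, floor) for v in values]  # values below 20 are set to 20
        high, low = max(floored), min(floored)
        if present_calls.get(gene, 0) < min_present:
            continue  # too few "Present" calls
        if high - low <= min_range or high / low <= min_ratio:
            continue  # insufficient intensity variation
        kept[gene] = [log2(v) for v in floored]
    return kept
```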



Immunohistochemistry 

Tumour tissue microarrays were prepared essentially as described (Kononen et al. 1998),
with four representative 0.6 mm paraffin cores from each study case. Immunohistochemical
staining was performed using standard highly sensitive techniques after appropriate
heat-induced antigen retrieval. Primary polyclonal goat antibodies against Smad 6 (S-20) and
cyclin G2 (N-19) were from Santa Cruz Biotechnology. Antibodies to p53 (monoclonal DO-7)
and Her-2 (polyclonal anti-c-erbB-2) were from Dako A/S. Ki-67 monoclonal antibody (MIB1)
was from Novocastra Laboratories Ltd. Staining intensity was scored at four levels (Negative,
Weak, Moderate and Strong) by an experienced pathologist who considered both colour
intensity and number of stained cells, and who was unaware of the array results.

EXAMPLE 3 

A molecular classifier detects carcinoma in situ expression signatures in tumors and
normal urothelium of the bladder.

Clinical samples

Bladder tumour samples were obtained directly from surgery following removal of tissue for
routine pathological examination. The samples were immediately submerged in a guanidinium
thiocyanate solution for RNA preservation and stored at -80°C. Informed consent was
obtained in all cases, and the protocols were approved by the scientific ethical committee of
Aarhus County. Samples in the No-CIS group were selected based on the following criteria:
a) Ta tumours with no CIS in selected site biopsies in all visits; b) no previous muscle
invasive tumour. Samples in the CIS group were selected based on the criteria: a) Ta or T1
tumours with CIS in selected site biopsies in any visit (preferably Ta tumours with CIS in the
sampling visit); b) no previous muscle invasive tumours. Normal biopsies were obtained from
individuals with prostatic hyperplasia or urinary incontinence. CIS and "normal" biopsies
were obtained from cystectomy specimens directly following removal of the bladder. A grid
was placed in the bladder for orientation and biopsies were taken from 8 positions covering
the bladder surface. At each position, three biopsies were taken: two for pathologic
examination and one in between these for RNA extraction for microarray expression
profiling. The samples for RNA extraction were immediately transferred to the guanidinium
thiocyanate solution and stored at -80°C until use. Samples used for RNA extraction were
assumed to have CIS if CIS was detected in both adjacent biopsies. The "normal" samples
were assumed to be normal if both adjacent biopsies were normal.

cRNA preparation, array hybridisation and scanning 

Purification of total RNA, preparation of cRNA from cDNA and hybridisation and scanning
were performed as previously described (Dyrskjot et al. 2003). The labelled samples were
hybridised to Affymetrix U133A GeneChips.

Expression data analysis 

Following scanning, all data were normalised using the RMA normalisation approach in the
Bioconductor Affy package for R. Variation filters were applied to the data to eliminate
non-varying and presumably non-expressed genes. For gene-set 1 this was done by only
including genes with a minimum expression above 200 in at least 5 samples and genes with
max/min expression intensities above or equal to 3. The filtering for gene-set 2 included only
genes with a minimum expression of 200 in at least 3 samples and genes with max/min
expression intensities above or equal to 3. Average linkage hierarchical cluster analysis was
carried out using the Cluster software with a modified Pearson correlation as similarity metric
(Eisen et al. 1998). We used the TreeView software for visualisation of the cluster analysis
results (Eisen et al. 1998). Genes were log-transformed, median centred and normalised to
the magnitude of 1 before clustering. We used GeneCluster 2.0
(http://www-genome.wi.mit.edu/cancer/software/genecluster2/gc2.html) for the supervised
selection of markers and for permutation testing. The algorithms used in the software are
based on (Golub et al. 1999, Tamayo et al. 1999). Classifiers for CIS detection were built
using the same methods as described previously (Dyrskjot et al. 2003).

Gene expression profiling

We used high-density oligonucleotide microarrays for gene expression profiling of
approximately 22,000 genes in 28 superficial bladder tumour biopsies (13 tumours with
surrounding CIS and 15 without surrounding CIS) and in 13 invasive carcinomas. See Table
19 for patient disease course descriptions. Furthermore, expression profiles were obtained
from 9 normal biopsies and from 10 biopsies from cystectomy specimens (5 histologically
normal biopsies and 5 biopsies with CIS).

Table 19 Clinical data on patient disease courses and results of molecular CIS classification 



Sample group (a) | Patient (b) | Previous tumours | Tumour analysed | Subsequent tumours | CIS (c) | CIS classifier (d)
1 | 1060-1 | | Ta gr2 | 2 Ta | No | No CIS
1 | 1146-1 | | Ta gr2 | | No | No CIS
1 | 1216-1 | | Ta gr2 | | No | No CIS
1 | 1303-1 | | Ta gr2 | | No | No CIS
1 | 524-1 | | Ta gr2 | | No | No CIS
1 | 692-1 | | Ta gr2 | 2 Ta | No | No CIS
1 | 1264-1 | | Ta gr3 | 20 Ta | No | No CIS
1 | 1350-1 | | Ta gr3 | 1 Ta | No | No CIS
1 | 1354-1 | | Ta gr3 | 11 T1 | No | No CIS
1 | 775-1 | | Ta gr3 | 1 Ta | No | No CIS
1 | 1066-1 | | Ta gr3 | 1 Ta | No | No CIS
1 | 1276-1 | | Ta gr3 | 2 T1 | No | No CIS
1 | 1070-1 | | Ta gr3 | 1 Ta | No | No CIS
1 | 989-1 | | Ta gr3 | | No | No CIS
1 | 1482-1 | | Ta gr3 | 20 Ta | No | CIS
2 | 1345-2 | 1 T1 | Ta gr3 | | Sampling visit | CIS
2 | 1062-2 | | Ta gr3 | 1 T1 | Sampling visit | CIS
2 | 956-2 | | Ta gr3 | 1 Ta | Sampling visit | CIS
2 | 320-7 | 1 Ta, 2 T1 | Ta gr3 | 2 Ta | Sampling visit | CIS
2 | 1330-1 | | Ta gr3 | | Sampling visit | CIS
2 | 602-8 | 5 Ta | Ta gr3 | 3 Ta | Sampling visit | CIS
2 | 763-1 | | Ta gr2 | 14 Ta | Sampling visit | CIS
2 | 1024-1 | | T1 gr3 | 2 Ta, 1 T1 | Sampling visit | CIS
2 | 1182-1 | | Ta gr3 | 7 Ta | Subsequent visit | CIS
2 | 1093-1 | | Ta gr3 | 4 Ta, 1 T1 | Subsequent visit | CIS
2 | 979-1 | | Ta gr3 | | Sampling visit | CIS
2 | 1337-1 | | T1 gr3 | | Sampling visit | CIS
2 | 1625-1 | | Ta gr2 | | Sampling visit | CIS
3 | 1015-1 | | T3b gr4 | | No | -
3 | 1337-1 | | T4a gr3 | | Sampling visit | -
3 | 1041-1 | | T4b gr3 | | No |
3 | 1044-1 | | T4b gr3 | | ND |
3 | 1055-1 | 1 Ta gr2 | T3a gr3 | | No |
3 | 1109-1 | | T2 gr3 | 1 T2-4 | No |
3 | 1124-1 | | T4a gr3 | 2 T2-4 | No |
3 | 1154-1 | | T3a gr3 | 1 Ta, 1 T2-4 | No |
3 | 1167-1 | 1 T2-4 | T3b gr4 | 2 T2-4 | ND |
3 | 1178-1 | | T4b gr3 | | ND |
3 | 1215-1 | | T4b gr3 | | ND |
3 | 1271-1 | | T3b gr4 | | No |
3 | 1321-1 | 1 T1 | T3b gr? | | ND |
a The tumour groups involved were TCC without CIS (1), TCC with CIS (2) and invasive TCC (3).
b The numbers indicate the patient number followed by the clinic visit number.
c CIS in selected site biopsies in previous, present or subsequent visits to the clinic. ND: not
determined.
d Molecular classification of the samples using 25 genes in cross-validation loops.

Hierarchical cluster analysis

Following appropriate normalisation and expression intensity calculations we selected those
genes that showed high variation across the 41 TCC samples for further analysis. The
filtering produced a gene-set consisting of 5,491 genes (gene-set 1) and two-way
hierarchical cluster analysis was performed based on this gene-set. The sample clustering
showed a separation of the three groups of samples with only a few exceptions (Fig. 14a).
Superficial TCC with surrounding CIS clustered in one main branch of the dendrogram,
while the superficial TCC without CIS and the invasive TCC clustered in two separate
sub-branches in the other main branch of the dendrogram. The only exceptions were that the
invasive TCC samples 1044-1 and 1124-1 clustered in the CIS group and two TCC with CIS
clustered in the invasive group (samples 1330-1 and 956-2). The only TCC without CIS that
clustered in the CIS group was sample 1482-1. The distinct clustering of the tumour groups
indicated a large difference in gene expression patterns.

Hierarchical clustering of the genes (Fig. 14c) identified large clusters of genes characteristic
of each tumour phenotype. Cluster 1 showed a cluster of genes down-regulated in
cystectomy biopsies, TCC with adjacent CIS and in some invasive carcinomas (Fig. 14c).
There is no obvious functional relationship between the genes in this cluster. Cluster 2
showed a tight cluster of genes related to immunology and cluster 3 contained mostly genes
expressed in muscle and connective tissue. Expression of genes in this cluster was
observed in the normal and cystectomy samples, in a fraction of the TCC with CIS and in the
invasive tumours. Cluster 4 contained genes up-regulated in the cystectomy biopsies, TCC
with adjacent CIS and in invasive carcinomas (Fig. 14c). This cluster includes genes
involved in cell cycle regulation, cell proliferation and apoptosis. However, for most of the
genes in this cluster there is no apparent functional relationship either. Comparisons of
chromosomal location of the genes in the clusters revealed no correlation between the
observed gene clusters and chromosomal position of the identified genes. A positive
correlation could have indicated chromosomal loss or gain or chromosomal inactivation by,
e.g., methylation of common promoter regions.

To analyse the impact of surrounding CIS lesions further we used the 28 superficial tumours
only, and created a new gene set consisting of 5,252 varying genes (gene-set 2).
Hierarchical cluster analysis of the tumour samples (Figure 13b) based on the new gene-set
separated the samples according to the presence of CIS in the surrounding urothelium with
only 1 exception (P<0.000001, χ²-test). Sample 1482-1 clustered in the TCC with CIS
group; however, no CIS has been detected in selected site biopsies during routine
examinations of this patient. Tumour samples 1182-1 and 1093-1 did not have CIS in
selected site biopsies in the same visit as the profiled tumour but showed this in later visits.
However, the profile of these two superficial tumour samples already showed the adjacent
CIS profile.
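
The reported significance can be checked with a 2x2 chi-square test. The contingency table below is an assumption reconstructed from the text: 13 tumours with CIS all fall in the CIS branch, and 14 of the 15 tumours without CIS fall in the other branch (sample 1482-1 being the exception). No continuity correction is applied, and the p-value for one degree of freedom is obtained from the complementary error function.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 degree of freedom
    return chi2, p_value


# Rows: CIS branch, no-CIS branch; columns: tumours with CIS, tumours without CIS.
chi2, p_value = chi_square_2x2(13, 1, 0, 14)
print(round(chi2, 2), p_value < 1e-6)  # 24.27 True
```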

Marker selection 

To delineate the tumours with surrounding CIS from the tumours without CIS we used t-test
statistics to select the 50 most up-regulated genes in each group (Figure 15a). Permutation
of the sample labels 500 times revealed that the 50 genes up-regulated in the CIS group are
highly significantly differentially expressed and unlikely to be found by chance, as all
markers were significant at the 5% level. Consequently, genes as good as these could be
selected in less than 5% of the 500 random datasets. The 50 genes up-regulated
in the no-CIS group performed more poorly in the permutation tests, as these were
not significant at the 5% level. See Table 20 for details. The relative expression of
these 100 genes in 9 normal biopsies and 10 biopsies from cystectomies with CIS is shown in
Figure 15b. The no-CIS profile was found in all of the normal samples. However, all
histologically normal samples adjacent to the CIS lesions as well as the CIS biopsies showed
the CIS profile.



Table 20The best 100 markers 



(U133 array) 

m ■ •' Lftri LI 


'•Class-;*!: i 




mm 


5% 


fPemt 1 

10% -v. 




* Re(?eq;cI^cffptlori 


221204_s_at 


no_CIS 


3.74 


5.12 


4.61 


4.33 


Hs.326444 


NM_018058; cartilage acidic 
protein 1 


205927_s_at 


no_CIS 


3.67 


4.53 


3.98 


3.73 


Hs.1355 


NM_001910; cathepsin E iso- 
form a preproprotein 
NM_148964; cathepsin E Iso- 
form b preproprotein 


210143_at 


no__CIS 


3.35 


4.03 


3.73 


3.45 


Hs.188401 


NM_007193; annexln A10 


204540_at 


no_CIS 


! 3.15 


3.87 


3.51 


3.32 


Hs.433839 


NM_001958; eukaryotlc transla- 
tion elongation factor 1 alpha 2 


214599_at 


no_CIS 


3.02 


3.75 


3.37 


3.14 


Hs.1 57091 


NM_005547; involucrln 


203649_s_at 


no_CIS 


2.84 


3.63 


3.20 


3.00 


Hs.76422 


NM_000300; phospholipase A2 t 
group IIA (platelets, synovial 
fluid) 


203980_at 


no_CIS 


2.74 


3.47 


3.12 


2.89 


Hs.391561 


NMJ)01442; tatty acid binding 
protein 4, adipocyte 


2G9270_at 


no_CIS 


2.39 


3.38 


3.10 


2.85 


Hs.436983 


NMJ500228; laminln subunlt 
beta 3 precursor 


206658_at 


no_CIS 


2.35 


3.37 


3.05 


2.78 


Hs.284211 


NM_030570; uroplakin 3B iso- 
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form a NM_1 82683; uroplakln 
3B Isoform c NM_1 82684; uro- 
plakln 3B isoform b 


220779_at 


no_CIS 


2.35 


3.33 


2.97 


2.73 


Hs.149195 


NM_016233; peptidylarginlne 
deiminase type III 


216971_s_at 


no_CIS 


2.28 


3.29 


2.91 


2.71 


Hs.79706 


NMJ)00445; plectin 1, Interme- 
diate filament binding protein 
500kDa 


206191_at 


no_CIS 


2.25 


3.24 


2.86 


2.68 


Hs.47042 


NM_001248; ectonucleoslde 
triphosphate diphosphohy- 
drolase 3 


218484_at 


no_CIS 


2.18 


3.20 


2.81 


2.62 


Hs.221447 


NM_020142; NADH:ubiquinone 
oxidoreductase MLRQ subunit 
homolog 


221854_at 


no_CIS 


2.1 


3.19 


2.80 


2.60 


Hs.313068 


NM_000299; plakophilin 1 


203792_x_at 


no_CIS 


2.02 


3.16 


2.74 


2.55 


Hs.371617 


NM_007144; ring finger protein 
110 


207862_at ! 


no_CIS 


2.01 


3.16 


2.72 


2.52 


Hs.379613 


NMJ)06760; uroplakln 2 


218960_at 


no__CIS 


1.93 


3.14 


2.65 


2.47 


Hs.414005 


NM_019894; transmembrane 
protease, serine 4 isoform 1 
NfVM 83247; transmembrane 
protease, serine 4 isoform 2 


203009_at 


no_CIS 


1.93 


3.12 


2.62 


2.45 


Hs. 155048 


NM_005581; Lutheran blood 
group (Auberger b antigen 
included) 


204508_s_at 


no_CIS 


1.88 


3.10 


2.60 


2.42 


Hs.279916 


NM_017689; hypothetical pro- 
tein FU20151 


211692_s_at 


no_CIS 


1.87 


3.06 


2.58 


2.39 


Hs.87246 


NM_014417; BCL2 binding 
component 3 


206465_at 


no_CIS 


1.86 


3.04 


2.54 


2.38 


Hs.277543 


NM_015162; lipidosln 


2Q6122_at 


nq_CIS 


1.85 


2.92 


2.52 


2.36 


Hs.95582 


NM_006942; SRY-box 15 


206393_at 


no_CIS 


1.83 


2.89 


2.49 


2.33 


Hs.83760 


NM_003282; troponin I, skeletal, 
fast 


214639_s_at 


no_CIS 


1.79 


2.87 


2.49 


2.30 


Hs.67397 


NM_G05522; homeobox A1 
protein Isoform a NIVM 53620; 
homeobox A1 protein isoform b 


214630_at 


no_CIS 


1.79 


2.84 


2.44 


2.28 


Hs.184927


NM_000497; cytochrome P450, 
subfamily XIB (steroid 11-beta- 
hydroxylase), polypeptide 1 
precursor 


204465_s_at 


no_CIS 


1.77 


2.81 


2.42 


2.27 


Hs.76888 


NM_004692; NM_032727; 
internexin neuronal intermediate
filament protein, alpha 


204990_s_at 


no_CIS 


1.76 


2.79 


2.41 


2.24 


Hs.85266 


NM_000213; integrin, beta 4


205453_at 


no_CIS 


1.75 


2.77 


2.39 


2.22 


Hs.290432 


NM_002145; homeo box B2 


215812_s_at 


no_CIS 


1.74 


2.77 


2.37 


2.20 


Hs.499113 


NM_018058; cartilage acidic
protein 1


217040_x_at


no_CIS 


1.74 


2.75 


2.36 


2.18 


Hs.95582 


NM_001910; cathepsin E iso-
form a preproprotein
NM_148964; cathepsin E iso-
form b preproprotein


203759_at 


no_CIS


1.73 


2.75 


2.34 


2.17 


Hs.75268


NM_007193; annexin A10


211 UU2_S_at 


no_CIS 


1.73 


2.74 


2.33 


2.17 


Hs.82237 


NM_001958; eukaryotic transla-
tion elongation factor 1 alpha 2


216641_s_at 


no_CIS 


1.73 


2.73 


2.31 


2.15 


Hs.18141 


NM_005547; involucrin


221660_at 


no_CIS 


1.71 


2.67 


2.30 


2.13 


Hs.247831 


NM_000300; phospholipase A2, 
group IIA (platelets, synovial
fluid) 


220026_at 


no_CIS 


1.71 


2.66 


2.28 


2.13 


Hs.227059 


NM_001442; fatty acid binding
protein 4, adipocyte 


209591_s_at


no_CIS


1.69 


2.63 


2.28 


2.11 


Hs.170195 


NM_000228; laminin subunit 
beta 3 precursor 


219922_s_at 


no_CIS


1.68 


2.61 


2.26 


2.08 


Hs.289019 


NM_030570; uroplakin 3B iso-
form a NM_182683; uroplakin
3B isoform c NM_182684; uro-
plakin 3B isoform b


201641_at


no_CIS


1.67 


2.61 


2.26 


2.07 


Hs.118110 


NM_016233; peptidylarginine 
deiminase type III 


204952_at 


no_CIS 


1.66 


2.59 


2.24 


2.07 


Hs.377028 


NM_000445; plectin 1, interme-
diate filament binding protein
500kDa


2D44o7_Sjat 


no_CIS


1.65 


2.59 


2.23 


2.06 


Hs.367809 


NM_001248; ectonucleoside 
triphosphate diphosphohy- 
drolase 3 


210761_s_at 


no_CIS


1.64 


2.59 


2.23 


2.05 


Hs.86859 


NM_020142; NADH:ubiquinone 
oxidoreductase MLRQ subunit 
homolog 


217626_at 


no_CIS


1.63 


2.58 


2.21 


2.04 


Hs.201967 


NM_000299; plakophilin 1 


204380_s_at 


no_CIS 


1.62 


2.58 


2.19 


2.03 


Hs.1420 


NM_007144; ring finger protein 
110 


205455_at 


no_CIS 


1.61 


2.58 


2.17 


2.02 


Hs.2942 


NM_006760; uroplakin 2 


205073_at 


no_CIS


1.61 


2.58 


2.17 


2.01 


Hs.152096


NM_019894; transmembrane
protease, serine 4 isoform 1
NM_183247; transmembrane
protease, serine 4 isoform 2


203287_at 


no_CIS


1.61 


2.58 


9 1fi 


o on 
z.uu 


Mo 


NM_005581; Lutheran blood
group (Auberger b antigen
included)


210735_s_at 


no_CIS 


1.58 


2.55 


2.15 


1.99 


Hs.5338 


NM_017689; hypothetical pro-
tein FLJ20151


203842_s_at 


no_CIS 


1.57 


2.54 


2.15 


1.97 


Hs.172740 


NM_014417; BCL2 binding 
component 3 


206561_s_at


no_CIS 


1.57 


2.53 


2.14 


1.96 


Hs.116724 


NM_015162; lipidosin 


214752_x_at 


no_CIS 


1.56 


2.52 


2.13 


1.95 


Hs.195464


NM_006942; SRY-box 15


217028_at


CIS 


4.87 


5.17 


4.67 


4.40 


Hs.421986 


NM_003282; troponin I, skeletal, 
fast 


213975_s_at 


CIS 


4.65 


4.43 


4.01 


3.76 


Hs.234734 


NM_005522; homeobox A1
protein isoform a NM_153620;
homeobox A1 protein isoform b


201859_at 


CIS 


4.59 


4.15 


3.70 


3.45 


Hs.1908 


NM_000497; cytochrome P450,
subfamily XIB (steroid 11-beta-
hydroxylase), polypeptide 1
precursor


219410_at 


CIS 


4.49 


3.98 


3.49 


3.29 


Hs.104800


NM_004692; NM_032727;
internexin neuronal intermediate
filament protein, alpha


207173_x_at 


CIS 


4.37 


3.88 


3.33 


3.11 


Hs.443435 


NM_000213; integrin, beta 4 


OiARtZA e at 




4.14 


O.OO 


0 00 


O GO 

z.yy 


U(« 4 07,4 00 


NM_002145; homeo box B2


201858_s_at 


CIS 


4.06 


3.78 


3.09 


2.91 


Hs.1908 


NM_018058; cartilage acidic
protein 1 


211430_s_at 


CIS 


4.03 


3.63 


3.05 


2.83 


Hs.413826 


NM_001910; cathepsin E iso-
form a preproprotein
NM_148964; cathepsin E iso-
form b preproprotein


0^ *}AQ1 c at 
4m 1 0057 I o a I 




4.00 




0 no 


O "7*7 




NM_007193; annexin A10


DO4070 a * 

£^ 1 Of fa Ql 








O AO 

z.oy 


Z.r 0 


|_|e ftOCAT 


NM_001958; eukaryotic transla- 
tion elongation factor 1 alpha 2 


212386_at 


CIS 


3.77 


3.50 


2.87 


2.69 


Hs.359289 


NM_005547; involucrin


211161_s_at 


CIS 


3.76 


3.42 


2.84 


2.65 




NM_000300; phospholipase A2,
group IIA (platelets, synovial
fluid) 


91ARRQ v at 




O.OO 


O. Oft 

0.00 


0 on 
Z.oU 


O CO 

2.62 


Hs.377975


NM_001442; fatty acid binding
protein 4, adipocyte 


917000 o 0 f 




0.44 


0 04 




O CO 

2.58 


Hs.444471


NM_000228; laminin subunit 
beta 3 precursor 


203477_at 


CIS 


3.36 


3.28 


2.75 


2.56 


Hs.409034 


NM_030570; uroplakin 3B iso-
form a NM_182683; uroplakin
3B isoform c NM_182684; uro-
plakin 3B isoform b






0. oe 

O.oO 






O CO 


Hs.409798


NM_016233; peptidylarginine
deiminase type III 




Olo 






O 7ft 

2.70 


2.48 


Hs.43080


NM_000445; plectin 1, interme-
diate filament binding protein
500kDa 


215176_x_at 


CIS 


3.32 


3.14 


2.67 


2.45 


Hs.503443 


NM_001248; ectonucleoside
triphosphate diphosphohy-
drolase 3


201842_s_at 


CIS 


3.31 


3.11 


2.65 


2.44 


Hs.76224 


NM_020142; NADH:ubiquinone
oxidoreductase MLRQ subunit 
homolog 


212667_at 


CIS 


3.3 


3.11 


2.63 


2.42 


Hs.111779


NM_000299; plakophilin 1


209340_at


CIS 


3.27 


3.10 


2.61 


2.39 


Hs.21293 


NM_007144; ring finger protein 
110 


215379_x_at 


CIS 


3.26 


3.10 


2.59 


2.39 


Hs.449601 


NM_006760; uroplakin 2


200762_at 


CIS 


3.25 


3.05 


2.56 


2.34 


Hs.173381


NM_019894; transmembrane 
protease, serine 4 isoform 1
NM_183247; transmembrane
protease, serine 4 isoform 2 


211896_s_at 


CIS 


3.21 


3.05 


2.53 


2.32 


Hs.156316 


NM_005581; Lutheran blood 
group (Auberger b antigen 
included) 


204141_at 


CIS 


3.19 


3.05 


2.53 


2.28 


Hs.300701 


NM_017689; hypothetical pro-
tein FLJ20151


201744_s_at 


CIS 


3.18 


3.03 


2.50 


2.27 


Hs.406475 


NM_014417; BCL2 binding 
component 3 


209138_x_at 


CIS 


3.17 


3.03 


2.47 


2.24 


Hs.505407 


NM_015162; lipidosin 


214677_x_at 


CIS 


3.14 


3.02 


2.47 


2.23 


Hs.449601 


NM_006942; SRY-box 15


212077_at 


CIS 


3.11 


2.99 


2.46 


2.21 


Hs.443811 


NM_003282; troponin I, skeletal,
fast 


206392_s_at


CIS 


3.11 


2.98 


2.43 


2.20 


Hs.82547 


NM_005522; homeobox A1
protein isoform a NM_153620;
homeobox A1 protein isoform b 


212998_x_at


CIS 


3.09 


2.94 


2.40 


2.19 


Hs.375115 


NM_000497; cytochrome P450,
subfamily XIB (steroid 11-beta- 
hydroxylase), polypeptide 1 
precursor 


201616_s_at 


CIS 


3.08 


2.93 


2.38 


2.18 


Hs.443811 


NM_004692; NM_032727;
internexin neuronal intermediate
filament protein, alpha 


205382_s_at 


CIS 


3.07 


2.88 


2.37 


2.15 


Hs.155597


NM_000213; integrin, beta 4 


212671_s_at 


CIS 


3.07 


2.85 


2.35 


2.14 


Hs.387679 


NM_002145; homeo box B2


215121_x_at 


CIS 


3.06 


2.84 


2.34 


2.13 


Hs.356861 


NM_018058; cartilage acidic
protein 1 


200600_at 


CIS 


3.06 


2.83 


2.33 


2.11 


Hs.170328 


NM_001910; cathepsin E Iso- 
form a preproprotein 
NM_148964; cathepsin E iso- 
form b preproprotein 


202746_at 


CIS 


3.03 


2.80 


2.32 


2.10 


Hs.17109


NM_007193; annexin A10 


202917_s_at 


CIS 


3 


2.79 


2.31 


2.08 


Hs.416073 


NM_001958; eukaryotic transla- 
tion elongation factor 1 alpha 2 


201560_at 


CIS 


3 


2.79 


2.30 


2.08 


Hs.25035 


NM_005547; involucrin


218918_at 


CIS 


2.99 


2.77 


2.29 


2.06 


Hs.8910 


NM_000300; phospholipase A2, 
group IIA (platelets, synovial
fluid) 


218656_s_at 


CIS 


2.99 


2.76 


2.27 


2.06 


Hs.93765 


NM_001442; fatty acid binding 
protein 4, adipocyte 


201088_at 


CIS 


2.99 


2.76 


2.26 


2.04 


Hs.159557


NM_000228; laminin subunit
beta 3 precursor


201291_s_at


CIS 


2.97 


2.75 


2.25 


2.04 


Hs.156346


NM_030570; uroplakin 3B iso-
form a NM_182683; uroplakin
3B isoform c NM_182684; uro-
plakin 3B isoform b


215076_s_at 


CIS 


2.95 


2.72 


2.24 


2.03 


Hs.443625 


NM_016233; peptidylarginine
deiminase type III


212195_at 


CIS 


2.94 


2.71 


2.22 


2.02 


Hs.71968 


NM_000445; plectin 1, interme-
diate filament binding protein
500kDa 


209732_at 


CIS 


2.94 


2.68 


2.22 


2.00 


Hs.85201 


NM_001248; ectonucleoside 
triphosphate diphosphohy- 
drolase 3 


212192_at 


CIS 


2.94 


2.67 


2.22 


1.99 


Hs.109438


NM_020142; NADH:ubiquinone
oxidoreductase MLRQ subunit 
homolog 


221671_x_at 


CIS 


2.92 


2.67 


2.20 


1.98 


Hs.377975 


NM_000299; plakophilin 1


211671_s_at 


CIS 


2.91 


2.66 


2.20 


1.98 


Hs.126608


NM_007144; ring finger protein 
110 


214352_s_at 


CIS 


2.88 


2.66 


2.19 


1.97 


Hs.412107 


NM_006760; uroplakin 2 



Feature: Probe-set on U133A GeneChip 
Class: The group in which the marker is up-regulated 
T-test: The t-test value 
Perm 1%: The 1% permutation level
Perm 5%: The 5% permutation level 
Perm 10%: The 10% permutation level 

Construction of a molecular CIS classifier 
A classifier able to diagnose CIS from gene expression in TCC or in bladder biopsies may
increase the detection rate of CIS. Our first approach was to classify superficial
TCC as occurring with or without CIS in the surrounding mucosa. A side effect of such a
classifier could be a reduction in the number of random biopsies that need to be taken.

We built a CIS classifier as previously described (Dyrskjot et al. 2003), using cross-validation
to determine the number of genes that classified CIS with the fewest errors. The best
classifier performance (1 error) was obtained in cross-validation loops using 25 genes (see
figure 16); 16 of these were included in 70% of the cross-validation loops, and these were
selected to represent our final classifier for CIS diagnosis (Fig. 17a and table 21).
Permutation analysis showed that 13 of these were significant at a 1% confidence level; the
remaining three genes were above a 10% confidence level.
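
The gene-selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the classifier of Dyrskjot et al. 2003: ranking genes by an absolute Welch t-statistic, the leave-one-out scheme, and all names (`welch_t`, `top_genes`, `stable_genes`, the toy data) are hypothetical choices made only for the example. It keeps the genes that appear among the top k in at least 70% of the cross-validation loops.

```python
import math
from collections import Counter


def welch_t(a, b):
    """Welch t-statistic for two lists of expression values (assumed ranking score)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    denom = math.sqrt(var_a / len(a) + var_b / len(b))
    if denom == 0.0:
        # No within-class variation: infinite separation unless the means coincide.
        return 0.0 if mean_a == mean_b else math.copysign(math.inf, mean_a - mean_b)
    return (mean_a - mean_b) / denom


def top_genes(expr, labels, k):
    """Return the k genes with the largest |t| between the CIS and no_CIS samples."""
    scores = {}
    for gene, values in expr.items():
        cis = [v for v, lab in zip(values, labels) if lab == "CIS"]
        non = [v for v, lab in zip(values, labels) if lab == "no_CIS"]
        scores[gene] = abs(welch_t(cis, non))
    return sorted(scores, key=scores.get, reverse=True)[:k]


def stable_genes(expr, labels, k, min_fraction=0.7):
    """Genes selected in at least min_fraction of the leave-one-out loops."""
    n = len(labels)
    counts = Counter()
    for left_out in range(n):
        keep = [i for i in range(n) if i != left_out]
        sub_expr = {g: [vals[i] for i in keep] for g, vals in expr.items()}
        sub_labels = [labels[i] for i in keep]
        counts.update(top_genes(sub_expr, sub_labels, k))
    return sorted(g for g, c in counts.items() if c / n >= min_fraction)


# Toy data (hypothetical): two informative genes and one uninformative gene.
toy_expr = {
    "gene_up": [5.0, 5.2, 4.9, 5.1, 1.0, 1.2, 0.9, 1.1],
    "gene_down": [1.0, 0.8, 1.1, 0.9, 4.0, 4.2, 3.9, 4.1],
    "gene_noise": [2.0, 3.0, 2.5, 2.2, 2.1, 2.9, 2.6, 2.3],
}
toy_labels = ["CIS"] * 4 + ["no_CIS"] * 4
selected = stable_genes(toy_expr, toy_labels, k=2)
```

Repeating the loop for several values of k and counting the misclassified left-out samples for each would give the kind of error curve that the text uses to choose the number of genes.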
Table 21. The 16 gene molecular classifier of CIS

Feature (U133A array) | Class | t-test | Perm 1% | Perm 5% | Perm 10% | UniGene | RefSeq; description
213633_at | no_CIS | 1.51 | 2.46 | 2.04 | 1.85 | Hs.97858 | NM_018957; SH3-domain binding protein 1
212784_at | no_CIS | 1.36 | 2.27 | 1.86 | 1.70 | Hs.388236 | NM_015125; capicua homolog
209241_x_at | no_CIS | 1.13 | 1.78 | 1.48 | 1.33 | Hs.112028 | NM_015716; misshapen/NIK-related kinase isoform 1; NM_153827; misshapen/NIK-related kinase isoform 3; NM_170663; misshapen/NIK-related kinase isoform 2
217941_s_at | CIS | 2.3 | 1.96 | 1.66 | 1.47 | Hs.8117 | NM_018695; erbb2 interacting protein
201877_s_at | CIS | 2.27 | 1.90 | 1.62 | 1.45 | Hs.249955 | NM_002719; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform a; NM_178586; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform b; NM_178587; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform c; NM_178588; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform d
209630_s_at | CIS | 1.97 | 1.54 | 1.31 | 1.15 | Hs.444354 | NM_012164; F-box and WD-40 domain protein 2
202777_at | CIS | 1.93 | 1.51 | 1.29 | 1.12 | Hs.104315 | NM_007373; soc-2 suppressor of clear homolog
200958_s_at | CIS | 1.92 | 1.49 | 1.28 | 1.11 | Hs.164067 | NM_005625; syndecan binding protein (syntenin)
209579_s_at | CIS | 1.79 | 1.36 | 1.16 | 1.01 | Hs.35947 | NM_003925; methyl-CpG binding domain protein 4
209004_s_at | CIS | 1.63 | 1.21 | 1.00 | 0.89 | Hs.5548 | NM_012161; F-box and leucine-rich repeat protein 5 isoform 1; NM_033535; F-box and leucine-rich repeat protein 5 isoform 2
218150_at | CIS | 1.6 | 1.18 | 0.98 | 0.86 | Hs.342849 | NM_012097; ADP-ribosylation factor-like 5 isoform 1; NM_177985; ADP-ribosylation factor-like 5 isoform 2
202076_at | CIS | 1.53 | 1.12 | 0.92 | 0.82 | Hs.289107 | NM_001166; baculoviral IAP repeat-containing protein 2
204640_s_at | CIS | 1.45 | 1.03 | 0.83 | 0.75 | Hs.129951 | NM_003563; speckle-type POZ protein
201887_at | CIS | 1.32 | 0.92 | 0.74 | 0.66 | Hs.285115 | NM_001560; interleukin 13 receptor, alpha 1 precursor
212802_s_at | CIS | 1.31 | 0.91 | 0.72 | 0.65 | Hs.287266 |
212899_at | CIS | 1.29 | 0.89 | 0.71 | 0.64 | Hs.129836 | NM_015076; cyclin-dependent kinase (CDC2-like) 11



Feature: Probe-set on U133A GeneChip
Class: The group in which the marker is up-regulated
T-test: The t-test value
Perm 1%: The 1% permutation level
Perm 5%: The 5% permutation level 
Perm 10%: The 10% permutation level 
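
One way to read the permutation levels in the legend is as quantiles of a score obtained after shuffling the class labels. The sketch below illustrates that reading only; the source does not state the exact statistic or permutation scheme, so the mean-difference score, the number of permutations, and every name here (`mean_diff_score`, `permutation_levels`, the toy values) are assumptions.

```python
import random


def mean_diff_score(values, labels, up_class):
    """Mean expression in up_class minus mean expression in the other samples."""
    up = [v for v, lab in zip(values, labels) if lab == up_class]
    rest = [v for v, lab in zip(values, labels) if lab != up_class]
    return sum(up) / len(up) - sum(rest) / len(rest)


def permutation_levels(values, labels, up_class,
                       levels=(0.01, 0.05, 0.10), n_perm=2000, seed=0):
    """Score thresholds reached by only a fraction `level` of label permutations."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    shuffled = list(labels)
    null_scores = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # break the link between sample and class
        null_scores.append(mean_diff_score(values, shuffled, up_class))
    null_scores.sort()
    thresholds = {}
    for level in levels:
        index = min(n_perm - 1, int((1.0 - level) * n_perm))
        thresholds[level] = null_scores[index]
    return thresholds


# Toy gene (hypothetical values): clearly higher in the four CIS samples.
toy_values = [5.0, 5.1, 4.9, 5.2, 1.0, 1.1, 0.9, 1.2]
toy_labels = ["CIS"] * 4 + ["no_CIS"] * 4
observed = mean_diff_score(toy_values, toy_labels, "CIS")
thresholds = permutation_levels(toy_values, toy_labels, "CIS")
```

Under this reading, a marker would be called significant at a given level when its observed score exceeds the corresponding threshold, which is how the t-test column can be compared with the three Perm columns.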

Exploration of strength of CIS classifier 

To further explore the strength of the CIS classification, we also built a classifier by randomly
selecting half of the samples for training and using the other half for testing. Cross-validation
was again used in the training of this classifier to optimise the gene set for classifying
independent samples. Cross-validation with 15 genes showed a good performance (see
figure 18), and 7 of these genes were included in 70% of the cross-validation loops. These 7
genes classified the samples in the test set with only one error, sample 1482-1 (χ² test,
P<0.002). Only two of the genes were also included in the 16-gene classifier, which is
understandable considering the number of tests performed and the limitations in sample
size. This classification performance is notable considering the small number of samples
used for training the classifier.
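
The train/test procedure in this paragraph can be illustrated with a short sketch. It assumes a nearest-centroid rule as the classifier, which the text does not specify, and the function names (`split_half`, `class_centroids`, `predict`, `misclassified`) and the toy profiles are hypothetical.

```python
import random


def split_half(sample_ids, seed=0):
    """Randomly split the sample identifiers into a training half and a test half."""
    rng = random.Random(seed)
    ids = list(sample_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]


def class_centroids(profiles, labels, train_ids):
    """Mean expression vector per class, computed on the training samples only."""
    grouped = {}
    for sample in train_ids:
        grouped.setdefault(labels[sample], []).append(profiles[sample])
    return {
        cls: [sum(column) / len(column) for column in zip(*vectors)]
        for cls, vectors in grouped.items()
    }


def predict(vector, centroids):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda cls: sum((a - b) ** 2 for a, b in zip(vector, centroids[cls])),
    )


def misclassified(profiles, labels, train_ids, test_ids):
    """Test samples whose predicted class differs from the known class."""
    centroids = class_centroids(profiles, labels, train_ids)
    return [s for s in test_ids if predict(profiles[s], centroids) != labels[s]]


# Toy profiles over two selected genes (hypothetical values).
toy_profiles = {
    "s1": [5.0, 1.0], "s2": [5.2, 0.9], "s3": [4.8, 1.1], "s4": [5.1, 1.2],
    "s5": [1.0, 4.0], "s6": [1.2, 4.1], "s7": [0.9, 3.9], "s8": [1.1, 4.2],
}
toy_labels = {s: ("CIS" if s in ("s1", "s2", "s3", "s4") else "no_CIS")
              for s in toy_profiles}
train_ids, test_ids = split_half(sorted(toy_profiles), seed=1)
errors = misclassified(toy_profiles, toy_labels,
                       ["s1", "s2", "s5", "s6"], ["s3", "s4", "s7", "s8"])
```

Keeping gene selection inside the training half, as the text describes, is what makes the error count on the other half an estimate of performance on independent samples.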


Grouping of normal and cystectomies with CIS 

We used hierarchical cluster analysis to group the 9 normal biopsies and the 10 biopsies from
cystectomies with CIS based on the normalised expression profiles of the 16 classifier genes
(Fig. 17b). This clustering separated the samples from cystectomies with CIS lesions from
the normal samples with only a few exceptions: 8 of the 10 biopsies from cystectomies were
found in one main branch of the dendrogram, and 8 of the 9 normal biopsies were found
on the other main branch (χ² test, P<0.002).
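
The branch counts reported here form a 2x2 table that can be tested for association. The sketch below computes a Pearson χ² statistic without continuity correction and its p-value for one degree of freedom. The source does not say which variant of the test was used, so the value computed this way need not coincide with the reported one, and the function name `chi_square_2x2` is a hypothetical helper.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p-value for one degree of freedom, using the
    identity P(X > x) = erfc(sqrt(x / 2)) for a chi-square variable with 1 df.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: cystectomy biopsies (8 in branch A, 2 in branch B),
#       normal biopsies (1 in branch A, 8 in branch B), as stated in the text.
chi2, p_value = chi_square_2x2(8, 2, 1, 8)
```

With cell counts this small, an exact test or a continuity-corrected statistic would give a more conservative p-value than the uncorrected form shown here.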



Claims

1. A method of predicting the prognosis of a biological condition in animal tissue,

comprising collecting a sample comprising cells from the tissue and/or expression prod- 
ucts from the cells, 

determining an expression level of at least one gene in the sample, said gene being se- 
lected from the group of genes consisting of gene No. 1 to gene No. 562, 

correlating the expression level to at least one standard expression level to predict the 
prognosis of the biological condition in the animal tissue. 

2. The method of claim 1, wherein the animal tissue is selected from body organs.

3. The method of claim 2, wherein the animal tissue is selected from epithelial tissue in 
body organs. 

4. The method of claim 3, wherein the animal tissue is selected from epithelial tissue in the 
urinary bladder.

5. The method according to claim 4, wherein the stage is selected from bladder cancer 
stages Ta, carcinoma in situ (CIS), T1, T2, T3 and T4.

6. The method according to claim 5, comprising determining at least the expression of a Ta
stage gene from a Ta stage gene group, at least one T1 stage gene from a T1 stage
gene group, at least a T2 stage gene from a T2 stage gene group, at least a T3 stage
gene from a T3 stage gene group, at least a T4 stage gene from a T4 stage gene
group, wherein at least one gene from each gene group is expressed in a significantly
different amount in that stage than in one of the other stages.

7. The method according to claim 4, 5 or 6, wherein the stage is bladder cancer stage Ta. 

8. The method according to claim 4, wherein the animal tissue is mucosa. 






9. The method of any of the preceding claims, wherein the biological condition is an adeno-
carcinoma, a carcinoma, a teratoma, a sarcoma, and/or a lymphoma and/or carcinoma-
in-situ, and/or dysplasia-in-situ.

10. The method of any of the preceding claims, wherein the sample is a biopsy of the tissue
or of metastasis originating from said tissue.



11. The method according to any of the preceding claims 1-6, wherein the sample is a cell
suspension made from the tissue.



12. The method according to any of the preceding claims, wherein the sample comprises 
substantially only cells from said tissue. 

13. The method according to claim 9, wherein the sample comprises substantially only cells
from mucosa or tumors derived from said mucosa cells. 



14. The method according to any of the preceding claims, wherein the gene from the group 
of genes is selected individually from gene No. 1 to gene No. 188 (stages). 


15. The method according to any of the preceding claims 1-13, wherein the gene from the 
group of genes is selected individually from gene No. 189 to gene No. 214 (recurrence). 

16. The method according to any of the preceding claims 1-13, wherein the gene from the 
group of genes is selected individually from gene No. 215 to gene No. 232 (SCC).

17. The method according to any of the preceding claims 1-13, wherein the gene from the 
group of genes is selected individually from gene No. 233 to gene No. 446 (progression). 

18. The method according to any of the preceding claims 1-13, wherein the gene from the
group of genes is selected individually from gene No. 447 to gene No. 562 (CIS). 



19. The method according to any of the preceding claims, wherein the expression level of at
least two genes from the group of genes is determined.

20. The method according to any of the preceding claims, wherein the expression level of at
least three genes from the group of genes is determined.

21. The method according to any of the preceding claims, wherein the expression level of at
least four genes from the group of genes is determined.

22. The method according to any of the preceding claims, wherein the expression level of at
least five genes from the group of genes is determined.

23. The method according to any of the preceding claims, wherein the expression level of
more than six genes from the group of genes is determined.

24. The method according to any of the preceding claims, wherein the difference in expres-
sion level of a gene from the gene group to the at least one standard expression level is
at least two-fold.

25. The method according to any of the preceding claims, wherein the difference in expres-
sion level of a gene from the gene group to the at least one standard expression level is at
least three-fold.

26. The method according to any of the preceding claims, wherein the difference in expres-
sion level of a gene from the gene group to the at least one standard expression level is at
least four-fold.


27. The method according to any of the preceding claims, wherein the expression level is 
determined by determining the mRNA of the cells. 

28. The method according to any of the claims 1-26, wherein the expression level is deter-
mined by determining expression products, such as peptides, in the cells.

29. The method according to claim 28, wherein the expression level is determined by deter- 
mining expression products, such as peptides, in the body fluids, such as blood, serum, 
plasma, faeces, mucus, sputum, cerebrospinal fluid, and/or urine. 


30. The method according to any of the preceding claims, wherein the stage of the biological 
condition has been determined prior to the prediction of the prognosis. 

31. The method according to claim 30, wherein the stage of the biological condition has
been determined by histological examination of the tissue or by genotyping of the tissue.

32. The method according to claim 28 or 29, wherein the stage of the biological condition 
has been determined by genotyping of the tissue. 

33. The method according to claim 31 or 32, wherein the stage of the biological condition
has been determined by

determining the expression of at least a first stage gene from a first stage gene group
and/or at least a second stage gene from a second stage gene group, wherein at least
one of said genes is expressed in said first stage of the condition in a higher amount
than in said second stage, and the other gene is expressed in said first stage of the
condition in a lower amount than in said second stage of the condition,

correlating the expression level of the assessed genes to a standard level of expression, and

determining the stage of the condition.

34. The method according to any of the preceding claims, wherein the expression level of at
least two genes is determined, by

determining a first expression level of at least one gene from a first gene group, wherein
the gene from the first gene group is selected from the group of gene No. 237, 238,
239, 240, 241, 242, 243, 245, 246, 247, 248, 250, 253, 254, 257, 258, 260, 263,
264, 265, 267, 270, 271, 272, 278, 283, 284, 287, 288, 290, 291, 292, 294, 297,
298, 300, 302, 303, 305, 309, 310, 315, 316, 317, 318, 319, 321, 324, 329, 335,
336, 337, 339, 340, 344, 346, 347, 354, 356, 358, 359, 362, 364, 365, 368, 369,
371, 372, 377, 378, 379, 380, 381, 382, 383, 384, 388, 391, 393, 395, 396, 397,
399, 402, 403, 404, 409, 413, 417, 419, 420, 421, 422, 423, 425, 427, 429, 430,
431, 432, 437, 444 (progressor genes), and

determining a second expression level of at least one gene from a second gene group,
wherein the second gene group is selected from the group of genes No. 233, 234, 235,
236, 244, 249, 251, 252, 255, 256, 259, 261, 262, 266, 268, 269, 273, 274, 275,
276, 277, 279, 280, 281, 282, 285, 286, 289, 293, 295, 296, 299, 301, 304, 306,
307, 308, 311, 312, 313, 314, 320, 322, 323, 325, 326, 327, 328, 330, 331,
332, 333, 334, 338, 341, 342, 343, 345, 348, 349, 350, 351, 352, 353, 355, 357,
360, 361, 363, 366, 367, 370, 373, 374, 375, 376, 385, 386, 387, 389, 390, 392,
394, 398, 400, 401, 405, 406, 407, 408, 410, 411, 412, 414, 415, 416, 418, 424,
426, 428, 433, 434, 435, 436, 438, 439, 440, 441, 442, 443, 445, 446 (non-
progressor genes), and

correlating the first expression level to a standard expression level for progressors, 
and/or the second expression level to a standard expression level for non-progressors to 
predict the prognosis of the biological condition in the animal tissue. 






35. A method of determining the stage of a biological condition in animal tissue,
comprising collecting a sample comprising cells from the tissue,

determining an expression level of at least one gene selected from the group of genes
consisting of gene No. 1 to gene No. 562,

correlating the expression level of the assessed genes to at least one standard level of
expression, and determining the stage of the condition.

36. The method according to claim 35, wherein the expression level of at least two genes is
determined by

determining the expression of at least a first stage gene from a first stage gene group
and at least a second stage gene from a second stage gene group, wherein at least one
of said genes is expressed in said first stage of the condition in a higher amount than in
said second stage, and the other gene is expressed in said first stage of the condition
in a lower amount than in said second stage of the condition, and

correlating the expression level of the assessed genes to a standard level of expression, and
determining the stage of the condition.

37. The method according to claim 35 or 36, wherein the stage is selected from bladder
cancer stages Ta, carcinoma in situ (CIS), T1, T2, T3 and T4.



38. The method according to claim 37, comprising determining at least the expression of a
Ta stage gene from a Ta stage gene group, at least one T1 stage gene from a T1 stage
gene group, at least a T2 stage gene from a T2 stage gene group, at least a T3 stage
gene from a T3 stage gene group, at least a T4 stage gene from a T4 stage gene
group, wherein at least one gene from each gene group is expressed in a significantly
different amount in that stage than in one of the other stages.



39. The method according to claim 38, wherein a Ta stage gene is selected individually from
the group of Table B1.



40. The method according to claim 38, wherein a T1 stage gene is selected individually from 
the group of Table B2. 


41. The method according to claim 38, wherein a T2 stage gene is selected individually from
the group of Table B3.

42. The method according to any of claims 35-41, said method comprising one or more of
the features defined in any of the claims 1-34.

43. A method of determining an expression pattern of a bladder cell sample, comprising: 


collecting a sample comprising bladder cells and/or expression products from bladder
cells,

determining the expression level of at least one gene in the sample, said gene being
selected from the group of genes consisting of gene No. 1 to gene No. 562, and obtain-
ing an expression pattern of the bladder cell sample.

44. The method according to claim 43, wherein the expression level of at least two genes
is determined.

45. The method according to claim 43, wherein the expression level of at least three genes
is determined.

46. The method according to claim 43, wherein the expression level of at least four genes
is determined.

47. The method according to claim 43, wherein the expression level of at least five genes
is determined.

48. The method according to claim 43, wherein the expression level of more than six genes
is determined.



49. The method of claims 43-48, wherein the genes exclude genes which are expressed in
the submucosal, muscle, or connective tissue, whereby a pattern of expression is formed
for the sample which is independent of the proportion of submucosal, muscle, or con-
nective tissue cells in the sample.



50. The method of claim 49, comprising determining the expression level of one or more
genes in the sample comprising predominantly submucosal, muscle, and connective tis-
sue cells, obtaining a second pattern, subtracting said second pattern from the expres-
sion pattern of the bladder cell sample, forming a third pattern of expression, said third
pattern of expression reflecting expression of the bladder mucosa or bladder cancer cells
independent of the proportion of submucosal, muscle, and connective tissue cells pres-
ent in the sample.

51. The method of any of the preceding claims 43-50, wherein the sample is a biopsy of the
tissue.

52. The method according to any of the preceding claims 43-51, wherein the sample is a cell
suspension. 

53. The method according to any of the preceding claims 43-52, wherein the sample com- 
prises substantially only cells from said tissue. 






54. The method according to claim 53, wherein the sample comprises substantially only cells 
from mucosa. 



55. A method of predicting the prognosis of a biological condition in human bladder tissue,
comprising

collecting a sample comprising cells from the tissue,

determining an expression pattern of the cells as defined in any of claims 43-54,

correlating the determined expression pattern to a reference pattern,

predicting the prognosis of the biological condition of said tissue.

56. A method for determining the stage of a biological condition in animal tissue,
comprising

collecting a sample comprising cells from the tissue,

determining an expression pattern of the cells as defined in any of claims 43-54,

correlating the determined expression pattern to a reference pattern,

determining the stage of the biological condition in said tissue.



57. A method for reducing cell tumorigenicity or malignancy of a cell, said method 
comprising 




contacting a tumor cell with at least one peptide expressed by at least one gene selected
from the group of genes consisting of gene Nos. 200-214, 233, 234, 235, 236, 244, 249, 251,
252, 255, 256, 259, 261, 262, 266, 268, 269, 273, 274, 275, 276, 277, 279, 280, 281, 282,
285, 286, 289, 293, 295, 296, 299, 301, 304, 306, 307, 308, 311, 312, 313, 314, 320, 322,
323, 325, 326, 327, 328, 330, 331, 332, 333, 334, 338, 341, 342, 343, 345, 348, 349, 350,
351, 352, 353, 355, 357, 360, 361, 363, 366, 367, 370, 373, 374, 375, 376, 385, 386, 387,
389, 390, 392, 394, 398, 400, 401, 405, 406, 407, 408, 410, 411, 412, 414, 415, 416, 418,
424, 426, 428, 433, 434, 435, 436, 438, 439, 440, 441, 442, 443, 445, 446, 453, 460, 461,
463, 464, 465, 466, 467, 469, 470, 471, 472, 473, 475, 476, 477, 479, 480, 481, 482, 483,
485, 486, 487, 488, 490, 492, 494, 496, 497, 498, 499, 503, 515, 516, 517, 521, 526, 527,
528, 530, 532, 533, 537, 539, 540, 541, 542, 543, 545, 554, 557, 560.

58. The method according to claim 57, wherein the tumor cell is contacted with at least two 
different peptides. 

59. A method for reducing cell tumorigenicity of a cell, said method comprising 

obtaining at least one gene selected from the group of genes consisting of gene Nos. 200-214,
233, 234, 235, 236, 244, 249, 251, 252, 255, 256, 259, 261, 262, 266, 268, 269, 273,
274, 275, 276, 277, 279, 280, 281, 282, 285, 286, 289, 293, 295, 296, 299, 301, 304, 306,
307, 308, 311, 312, 313, 314, 320, 322, 323, 325, 326, 327, 328, 330, 331, 332, 333, 334,
338, 341, 342, 343, 345, 348, 349, 350, 351, 352, 353, 355, 357, 360, 361, 363, 366, 367,
370, 373, 374, 375, 376, 385, 386, 387, 389, 390, 392, 394, 398, 400, 401, 405, 406, 407,
408, 410, 411, 412, 414, 415, 416, 418, 424, 426, 428, 433, 434, 435, 436, 438, 439, 440,
441, 442, 443, 445, 446, 453, 460, 461, 463, 464, 465, 466, 467, 469, 470, 471, 472, 473,
475, 476, 477, 479, 480, 481, 482, 483, 485, 486, 487, 488, 490, 492, 494, 496, 497, 498,
499, 503, 515, 516, 517, 521, 526, 527, 528, 530, 532, 533, 537, 539, 540, 541, 542, 543,
545, 554, 557, 560,

introducing said at least one gene into the tumor cell in a manner allowing expression 
of said gene(s). 

60. The method according to claim 59, wherein at least one gene is introduced into the
tumor cell.

61. The method according to claim 59 or 60, wherein at least two different genes are
introduced into the tumor cell.

62. A method for reducing cell tumorigenicity or malignancy of a cell, said method
comprising



obtaining at least one nucleotide probe capable of hybridising with at least one gene of 
5 a tumor cell, said at least one gene being selected from the group of genes consisting 

Of gene Nos. 1-199, 215-232, 237, 238, 239, 240, 241 , 242, 243, 245, 246, 247, 248, 
250, 253, 254, 257, 258, 260, 263, 264, 265, 267, 270, 271, 272, 278. 283, 284, 287, 
288, 290, 291, 292, 294, 297, 298, 300, 302, 303, 305, 309, 310, 315, 316, 317, 318, 
319, 321, 324, 329, 335, 336, 337, 339, 340, 344, 346, 347, 354, 356, 358, 359, 362, 

1 0 364, 365, 368, 369, 371 , 372, 377, 378, 379, 380, 381 , 382, 383, 384, 388. 391 , 393, 

395, 396, 397, 399, 402, 403, 404, 409, 413, 417, 419, 420, 421, 422, 423, 425, 427 
,429, 430, 431, 432, 437, 444, 447, 448, 449, 450, 451, 452, 454, 455 ,456, 457, 458, 
459, 462, 468, 474, 478, 484, 489, 491, 493, 495, 500, 501, 502, 504, 505, 506, 507, 
508, 509, 510, 511, 512, 513, 514, 518 , 519, 520, 522, 523, 524, 525, 529, 531, 534, 

15 535, 536, 538, 544, 546, 547, 548, 549, 550, 551, 552, 553, 555, 556, 558, 559, 561, 

562, 



introducing said at least one nucleotide probe into the tumor cell in a manner allowing 
the probe to hybridise to the at least one gene, thereby inhibiting expression of said at 
20 least one gene. 

63. The method according to claim 62, wherein at least one gene is introduced into the 
tumor cell. 



64. The method according to claim 62 or 63, wherein at least two different genes are
introduced into the tumor cell. 



65. A pharmaceutical composition for the treatment of a biological condition comprising at
least one antibody against an expression product of a cell from a biological tissue
produced by

obtaining expression product(s) from at least one gene, said gene being selected from
the group of genes consisting of genes as defined in claim 62,

immunising a mammal with said expression product(s), obtaining antibodies against
the expression product.

66. A pharmaceutical composition for the treatment of a biological condition comprising at
least one peptide, said peptide being an expression product from a gene selected from
the group consisting of gene Nos. 1-562, or a fragment thereof.

67. A vaccine for the prophylaxis or treatment of a biological condition comprising at least
one expression product from at least one gene, said gene being selected from the group
of genes consisting of genes as defined in claim 62.

68. Use of a method as defined in any of claims 1-64 for producing an assay for diagnosing
a biological condition in animal tissue.

69. Use of at least one expression product from at least one gene for preparation of a
pharmaceutical composition for the treatment of a biological condition in animal tissue.

70. Use of a gene, said gene being selected from the group of genes consisting of gene No.
1 to gene No. 562, for the preparation of a pharmaceutical composition for the
treatment of a biological condition in animal tissue.

71. Use of a probe as defined in any of claims 62-64 for the preparation of a pharmaceutical
composition for the treatment of a biological condition in animal tissue.

72. An assay for predicting the prognosis of a biological condition in animal tissue,
comprising

at least one first marker capable of detecting an expression level of at least one gene
selected from the group of genes consisting of gene No. 1 to gene No. 562.

73. The assay according to claim 72, wherein the marker is a nucleotide probe.

74. The assay according to claim 72, wherein the marker is an antibody.

75. The assay according to claim 72, comprising at least a first marker and/or a second
marker, wherein the first marker is capable of detecting a gene from a first gene group
as defined in claim 34, and/or the second marker is capable of detecting a gene from a
second gene group as defined in claim 34.

76. The assay according to any of claims 72-75, said assay further comprising means for
correlating the expression level of the at least one gene to a standard expression level
and/or a reference expression pattern.

Figure 14

3. [ ] Claims Nos.:

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows: 

see additional sheet 



1. [ ] As all required additional search fees were timely paid by the applicant, this International Search Report covers all
searchable claims.

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment
of any additional fee.

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this International Search Report
covers only those claims for which fees were paid, specifically claims Nos.:

4. [X] No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

1-14, 19-33, 35-56 and 62-76 (all claims partially)

Remark on Protest [ ] The additional search fees were accompanied by the applicant's protest.

[ ] No protest accompanied the payment of additional search fees.



Form PCT/ISA/210 (continuation of first sheet (1)) (July 1998)



INTERNATIONAL SEARCH REPORT 



International Application No. PCT/DK 03/00750



FURTHER INFORMATION CONTINUED FROM PCT/ISA/ 210 



Continuation of Box 1.1 

Claims Nos.: 1-64 and 68 (all partially)



Claims 1-56 and 68 relate to methods of treatment of the human or animal 
body by surgery or by therapy or diagnostic methods practiced on the 
human or animal body (PCT Rule 39.1(iv)). The methods include a step of 
collecting a sample, which does not exclude that the sample is collected 
in vivo. The search has been executed with the assumption that the 
collection is not done in vivo. 

Claims 57-64 and 68 relate to methods of treatment of the human or animal 
body by surgery or by therapy or diagnostic methods practiced on the 
human or animal body (PCT Rule 39.1(iv)). The methods include a step of 
contacting a tumor cell with a peptide or introducing a probe into a 
tumor cell, which does not exclude that the method is carried out in 
vivo. The search has been executed with the assumption that the methods
are not carried out in vivo.

Claims Nos.: 1-4, 8-36, 42, 55-56 and 68-76 (all partially)



The wording of the present claims 1-4, 8-36, 42, 55-56 and 68-76 renders 
it difficult, if not impossible, to determine the matter for which 
protection is sought, due to the expression "biological condition". 
Therefore, the present application fails to comply with the clarity and 
conciseness requirements of Article 6 PCT (see also Rule 6.1(a) PCT) to 
such an extent that a meaningful search on the basis of the claims is 
impossible. Consequently, the search has been carried out for those parts 
of the application which do appear to be clear and concise, namely the 
biological condition bladder cancer. 

The applicant's attention is drawn to the fact that claims, or parts of 
claims, relating to inventions in respect of which no international 
search report has been established need not be the subject of an 
international preliminary examination (Rule 66.1(e) PCT). The applicant 
is advised that the EPO policy when acting as an International 
Preliminary Examining Authority is normally not to carry out a 
preliminary examination on matter which has not been searched. This is 
the case irrespective of whether or not the claims are amended following 
receipt of the search report or during any Chapter II procedure.

This International Searching Authority found multiple (groups of)
inventions in this international application, as follows: 

Invention 1: claims 1-14, 19-33, 

35-56 and 62-76 (all claims partially) 



Method for predicting the prognosis of a biological 
condition using gene no. 1 as well as additional 
applications of gene no. 1. 



Invention 2: claims 1-14, 19-33, 

35-56 and 62-76 (all claims partially) 



Method for predicting the prognosis of a biological 
condition using gene no. 2 as well as additional 
applications of gene no. 2. 

etc. etc... 



Invention 188: claims 1-14, 19-33, 

35-56 and 62-76 (all claims partially) 



Method for predicting the prognosis of a biological 
condition using gene no. 188 as well as additional 
applications of gene no. 188. 



Invention 189: claims 1-4, 8-13, 15, 19-33, 

35-36 and 42-76 (all claims except claim 15 
partially) 



Method for predicting the prognosis of a biological
condition using genes no. 189-214, i.e. the part of the
claims relating to genes associated with recurrence, as well
as additional applications of gene no. 189-214.



Invention 190: claims 1-4, 8-13, 16, 19-33, 35-36, 

42-56 and 62-76 (all claims except claim 16 
partially) 



Method for predicting the prognosis of a biological
condition using genes no. 215-232, i.e. the part of the
claims relating to genes associated with squamous metaplasia,
as well as additional applications of gene no. 215-232.



Invention 191: claims 1-4, 8-13, 17,

19-36 and 42-76 (all claims except claims 17 and
34 partially)



Method for predicting the prognosis of a biological
condition using genes no. 233-446, i.e. the part of the
claims relating to genes associated with progression, as
well as additional applications of gene no. 233-446.



Invention 192: claims 1-5, 8-13, 18-33, 

35-37 and 42-76 (all claims except claim 18 
partially) 



Method for predicting the prognosis of a biological
condition using genes no. 447-562, i.e. the part of the
claims relating to genes associated with carcinoma in situ,
as well as additional applications of gene no. 447-562.

Gene expression in biological conditions

Technical field of the invention

The present invention relates to a method of predicting the prognosis of a biological
condition in animal tissue, wherein the expression of genes is examined and correlated to
standards. The invention further relates to the treatment of the biological condition and an
assay for predicting the prognosis.

Background

The building of large databases containing human genome sequences is the basis for
studies of gene expression in various tissues during normal physiological and pathological
conditions. Constantly (constitutively) expressed sequences as well as sequences whose
expression is altered during disease processes are important for our understanding of
cellular properties, and for the identification of candidate genes for future therapeutic
intervention. As the number of known genes and ESTs builds up in the databases, array-based
simultaneous screening of thousands of genes is necessary to obtain a profile of
transcriptional behaviour, and to identify key genes that, either alone or in combination with
other genes, control various aspects of cellular life. One cellular behaviour that has been a
mystery for many years is the malignant behaviour of cancer cells. It is now known that, for
example, defects in DNA repair can lead to cancer, but the cancer-creating mechanism in
heterozygous individuals is still largely unknown, as is the malignant cell's ability to repeat
cell cycles, to avoid apoptosis, to escape the immune system, to invade and metastasize, and
to escape therapy. There are indications in these areas and excellent progress has been
made, but the myriad of genes interacting with each other in a highly complex
multidimensional network is making the road to insight long and contorted.

Similar appearing tumors - morphologically, histochemically, microscopically - can be
profoundly different. They can have different invasive and metastasizing properties, as well
as respond differently to therapy. There is thus a need in the art for methods which
distinguish tumors and tissues on factors different from those currently in clinical use.

The malignant transformation from normal tissue to cancer is believed to be a multistep
process, in which tumor suppressor genes that normally repress cancer growth show reduced
gene expression and in which other genes that encode tumor
promoting proteins (oncogenes) show an increased expression level. Several tumor suppressor
genes have been identified up till now, e.g. p16, Rb, p53 (Nesrin Özören and
Wafik S. El-Deiry, Introduction to cancer genes and growth control, In: DNA alterations in
cancer, genetic and epigenetic changes, Eaton publishing, Melanie Ehrlich (ed) p. 1-43,
2000; and references therein). They are usually identified by their lack of expression or their
mutation in cancer tissue.

Other examinations have shown this downregulation of transcripts to be partly due to loss of
genomic material (loss of heterozygosity), partly to methylation of promoter regions, and
partly due to unknown factors (Nesrin Özören and Wafik S. El-Deiry,
Introduction to cancer genes and growth control, In: DNA alterations in cancer, genetic
and epigenetic changes, Eaton publishing, Melanie Ehrlich (ed) p. 1-43, 2000; and
references therein).

Several oncogenes are known, e.g. cyclinD1/PRAD1/BCL1, FGFs, c-MYC, BCL-2, all of
which are genes that are amplified in cancer, showing an increased level of transcript
(Nesrin Özören and Wafik S. El-Deiry, Introduction to cancer genes
and growth control, In: DNA alterations in cancer, genetic and epigenetic changes, Eaton
publishing, Melanie Ehrlich (ed) p. 1-43, 2000; and references therein). Many of these
genes are related to cell growth and direct the tumor cells to uninhibited
growth. Others may be related to tissue degradation as they e.g. encode enzymes that break
down the surrounding connective tissue.

Bladder cancer is the fourth most common malignancy in males in the western countries
(Pisani). The disease basically takes two different courses: one where patients have multiple
recurrences of superficial tumors (Ta and T1), and one where the disease from the beginning
is muscle invasive (T2+) and leads to metastasis. About 5-10% of patients with Ta tumors
and 20-30% of the patients with T1 tumors will eventually develop a higher stage tumor
(Wolf). Patients with superficial bladder tumors represent 75% of all bladder cancer patients
and no clinically useful markers identifying patients with a poor prognosis exist at present.

The patients presenting isolated or concomitant carcinoma in situ (CIS) lesions have a high
risk of disease progression to a muscle invasive stage (Althausen). The CIS lesions may
have a widespread manifestation in the bladder (field disease) and are believed to be the
most common precursors of invasive carcinomas (Spruck, Rosin). The ability to predict
which tumours are likely to recur or progress would have great impact on the clinical
management of patients with superficial disease, as it would be possible to treat high-risk
patients more aggressively (e.g. radical cystectomy or adjuvant therapy). This approach is
currently not possible, as no clinically useful markers exist that identify these patients.
Although many prognostic markers have been investigated, the most important prognostic
factors are still disease stage, dysplasia grade and especially the presence of areas with CIS
(Anderstrom, Cummings, Cheng). The gold standard for detection of CIS is urine cytology
and histopathologic analysis of a set of selected site biopsies removed during routine
cystoscopy examinations; however, these procedures are not sufficiently sensitive.
Implementing routine cystoscopy examinations with 5-ALA fluorescence imaging of the
tumours and pre-cancerous lesions (CIS lesions and moderate dysplasia lesions) may
increase the sensitivity of the procedure (Kriegmar); however, increased detection sensitivity
is still necessary in order to offer better treatment regimens to the individual patients.

Summary of the invention 

The present invention relates to prediction of prognosis of a biological condition, in particular
to the prognosis of cancer such as bladder cancer. It is known that individuals suffering from
cancer, although their tumors macroscopically and microscopically are identical, may have
very different outcomes. The present inventors have identified new predictor genes to classify
macroscopically and microscopically identical tumors into two or more groups, wherein
each group has a separate risk profile of recurrence, invasive growth, metastasis etc. as
compared to the other group(s). The present invention relates to genotyping of the tissue,
and correlating the result to standard expression level(s) to predict the prognosis of the
biological condition.

Accordingly, in one aspect the present invention relates to a method of predicting the
prognosis of a biological condition in animal tissue,

comprising collecting a sample comprising cells from the tissue and/or expression
products from the cells,

determining an expression level of at least one gene in said sample, said gene being
selected from the group of genes consisting of gene No. 1 to gene No. 562,

correlating the expression level to at least one standard expression level to predict the
prognosis of the biological condition in the animal tissue.

The genes No. 1 to No. 562 are found in table A described below herein.
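
The correlating step can be pictured as a comparison of a measured expression level with a standard level for the same gene. The following is a minimal sketch only, assuming positive expression values on a linear scale. The function names and the two-fold threshold `min_fold` are hypothetical illustrations and are not taken from the present description.

```python
import math


def log2_ratio(sample_level, standard_level):
    """Return log2(sample / standard); both levels must be positive."""
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    return math.log2(sample_level / standard_level)


def compare_to_standard(sample_level, standard_level, min_fold=2.0):
    """Label a gene 'higher', 'lower' or 'similar' relative to the standard.

    min_fold is an assumed, illustrative threshold (not from the source).
    """
    ratio = log2_ratio(sample_level, standard_level)
    cutoff = math.log2(min_fold)
    if ratio >= cutoff:
        return "higher"
    if ratio <= -cutoff:
        return "lower"
    return "similar"
```

In such a sketch, a call like `compare_to_standard(400, 100)` would report the gene as expressed at a higher level than the standard; how such calls are combined into a prognosis is left to the methods described herein.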



Animal tissue may be tissue from any animal, preferably from a mammal, such as a horse, a
cow, a dog, a cat, and more preferably the tissue is human tissue. The biological condition
may be any condition exhibiting gene expression different from normal tissue. In particular
the biological condition relates to a malignant or premalignant condition, such as a tumor or
cancer, in particular bladder cancer. By the term "collecting a sample comprising cells" is
meant the sample is provided in a manner, so that the expression level of the genes may be
determined.

Furthermore, the invention relates to a method of determining the stage of a biological
condition in animal tissue,

comprising collecting a sample comprising cells from the tissue,

determining an expression level of at least one gene in said sample, said gene being
selected from the group of genes consisting of gene No. 1 to gene No. 562,

correlating the expression level of the assessed genes to at least one standard level of
expression, determining the stage of the condition.

The determination of the stage of the biological condition may be conducted prior to the
method of predicting the prognosis, or the stage of the biological condition may as such contain
the information about the prognosis.

The methods above may be used for determining single gene expressions; however, the
invention also relates to a method of determining an expression pattern of a bladder cell
sample, comprising:

collecting a sample comprising bladder cells and/or expression products from bladder
cells,

determining the expression level of at least one gene in the sample, said gene being
selected from the group of genes consisting of gene No. 1 to gene No. 562, and obtaining
an expression pattern of the bladder cell sample.

Further, the invention relates to a method of determining an expression pattern of a bladder
cell sample independent of the proportion of submucosal, muscle, or connective tissue cells
present, comprising:

determining the expression of one or more genes in a sample comprising cells, wherein
the one or more genes exclude genes which are expressed in the submucosal, muscle,
or connective tissue, whereby a pattern of expression is formed for the sample which is
independent of the proportion of submucosal, muscle, or connective tissue cells in the
sample.

The expression pattern may be used in a method according to this invention, and
accordingly, the invention also relates to a method of predicting the prognosis of a biological
condition in human bladder tissue, comprising

collecting a sample comprising cells from the tissue,

determining an expression pattern of the cells as defined in any of claims 43-54,

correlating the determined expression pattern to a standard pattern,

predicting the prognosis of the biological condition of said tissue

as well as a method for determining the stage of a biological condition in animal tissue,
comprising

collecting a sample comprising cells from the tissue,

determining an expression pattern of the cells as defined above,

correlating the determined expression pattern to a standard pattern,

determining the stage of the biological condition in said tissue.

The invention further relates to a method for reducing cell tumorigenicity or malignancy of a
cell, said method comprising

contacting a tumor cell with at least one peptide expressed by at least one gene selected
from the group of genes consisting of gene Nos. 200-214, 233, 234, 235, 236, 244, 249,
251, 252, 255, 256, 259, 261, 262, 266, 268, 269, 273, 274, 275, 276, 277, 279, 280, 281,
282, 285, 286, 289, 293, 295, 296, 299, 301, 304, 306, 307, 308, 311, 312, 313, 314, 320,
322, 323, 325, 326, 327, 328, 330, 331, 332, 333, 334, 338, 341, 342, 343, 345, 348, 349,
350, 351, 352, 353, 355, 357, 360, 361, 363, 366, 367, 370, 373, 374, 375, 376, 385, 386,
387, 389, 390, 392, 394, 398, 400, 401, 405, 406, 407, 408, 410, 411, 412, 414, 415, 416,
418, 424, 426, 428, 433, 434, 435, 436, 438, 439, 440, 441, 442, 443, 445, 446, 453, 460,
461, 463, 464, 465, 466, 467, 469, 470, 471, 472, 473, 475, 476, 477, 479, 480, 481, 482,
483, 485, 486, 487, 488, 490, 492, 494, 496, 497, 498, 499, 503, 515, 516, 517, 521, 526,
527, 528, 530, 532, 533, 537, 539, 540, 541, 542, 543, 545, 554, 557, 560, or

obtaining at least one gene selected from the group of genes consisting of gene Nos.
200-214, 233, 234, 235, 236, 244, 249, 251, 252, 255, 256, 259, 261, 262, 266, 268, 269, 273,
274, 275, 276, 277, 279, 280, 281, 282, 285, 286, 289, 293, 295, 296, 299, 301, 304, 306,
307, 308, 311, 312, 313, 314, 320, 322, 323, 325, 326, 327, 328, 330, 331, 332, 333, 334,
338, 341, 342, 343, 345, 348, 349, 350, 351, 352, 353, 355, 357, 360, 361, 363, 366, 367,
370, 373, 374, 375, 376, 385, 386, 387, 389, 390, 392, 394, 398, 400, 401, 405, 406, 407,
408, 410, 411, 412, 414, 415, 416, 418, 424, 426, 428, 433, 434, 435, 436, 438, 439, 440,
441, 442, 443, 445, 446, 453, 460, 461, 463, 464, 465, 466, 467, 469, 470, 471, 472, 473,
475, 476, 477, 479, 480, 481, 482, 483, 485, 486, 487, 488, 490, 492, 494, 496, 497, 498,
499, 503, 515, 516, 517, 521, 526, 527, 528, 530, 532, 533, 537, 539, 540, 541, 542, 543,
545, 554, 557, 560, and introducing said at least one gene into the tumor cell in a manner
allowing expression of said gene(s), or

obtaining at least one nucleotide probe capable of hybridising with at least one gene of a
tumor cell, said at least one gene being selected from the group of genes consisting of gene
Nos. 1-199, 215-232, 237, 238, 239, 240, 241, 242, 243, 245, 246, 247, 248, 250, 253, 254,
257, 258, 260, 263, 264, 265, 267, 270, 271, 272, 278, 283, 284, 287, 288, 290, 291, 292,
294, 297, 298, 300, 302, 303, 305, 309, 310, 315, 316, 317, 318, 319, 321, 324, 329, 335,
336, 337, 339, 340, 344, 346, 347, 354, 356, 358, 359, 362, 364, 365, 368, 369, 371, 372,
377, 378, 379, 380, 381, 382, 383, 384, 388, 391, 393, 395, 396, 397, 399, 402, 403, 404,
409, 413, 417, 419, 420, 421, 422, 423, 425, 427, 429, 430, 431, 432, 437, 444, 447, 448,
449, 450, 451, 452, 454, 455, 456, 457, 458, 459, 462, 468, 474, 478, 484, 489, 491, 493,
495, 500, 501, 502, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 518, 519, 520,
522, 523, 524, 525, 529, 531, 534, 535, 536, 538, 544, 546, 547, 548, 549, 550, 551, 552,
553, 555, 556, 558, 559, 561, 562, and introducing said at least one nucleotide probe into
the tumor cell in a manner allowing the probe to hybridise to the at least one gene, thereby
inhibiting expression of said at least one gene.

In a further aspect the invention relates to a method for producing antibodies against an 
expression product of a cell from a biological tissue, said method comprising the steps of 

obtaining expression product(s) from at least one gene, said gene being expressed as
defined above,

immunising a mammal with said expression product(s), obtaining antibodies against the
expression product.

The antibodies produced may be used for producing a pharmaceutical composition. Further,
the invention relates to a vaccine capable of eliciting an immune response against at least
one expression product from at least one gene, said gene being expressed as defined above.

The invention furthermore relates to the use of any of the methods discussed above for
producing an assay for diagnosing a biological condition in animal tissue.

Also, the invention relates to the use of a peptide as defined above as an expression product
and/or the use of a gene as defined above and/or the use of a probe as defined above for
preparation of a pharmaceutical composition for the treatment of a biological condition in
animal tissue.

In yet a further aspect the invention relates to an assay for determining the presence or
absence of a biological condition in animal tissue, comprising

at least one first marker capable of detecting an expression level of at least one gene
selected from the group of genes consisting of gene No. 1 to gene No. 562.

In another aspect the invention relates to an assay for determining an expression pattern of
a bladder cell, comprising at least a first marker and/or a second marker, wherein the
first marker is capable of detecting a gene from a first gene group as defined above, and the
second marker is capable of detecting a gene from a second gene group as defined above.

Drawings

Description of figures:

Figure 1 Hierarchical cluster analysis of tumor samples based on 3,197 genes that show
large variation across all tumor samples. Samples with progression are marked Prog.

Figure 2 Delineation of the 200 best marker genes. Genes that show higher levels of
expression in the non-progression group are shown at the top and genes that show higher
levels of expression in the progression group are shown at the bottom. Each column in the
diagram represents a tumor sample and each row a gene. The 13 non-progressing samples
are shown to the left and the 16 progressing samples are shown to the right in the diagram.
The color saturation indicates differences in gene expression across the tumor samples; light
color indicates up regulation compared to the median expression and down regulation
compared to the median expression of the gene is shown in dark color. Gene names of
particularly interesting genes are listed. Notably, non-group expression patterns were
observed for two tumors (arrows). The tumor in the no progression group (150-6) showed a
solid growth pattern, which is associated with a poor prognosis. No special tumor
characteristics can help explain the gene expression pattern observed for the tumor in the
progression group (825-3).



Figure 3. Cross-validation performance using from 1 to 200 genes.

Figure 4. Predicting progression in early stage bladder tumors. a, The 45-gene expression
signature found to be optimal for progression prediction. Genes showing high expression in
progressing samples are shown at the top and genes showing high expression in the
non-progressing samples are shown at the bottom. Genes are listed according to how many
cross-validation loops included the genes. b, The 45-gene expression signature in the 19
tumor test-set. The samples are listed according to the correlation to the average
non-progression signature from the training set samples. The red punctuated line separates
samples with positive (left) and negative (right) correlation values. The white lines separate
samples above and below the correlation cutoff values of 0.1 and -0.1. The sample legend
indicates no-progression (N) samples and progression (P) samples.
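
The ordering of test samples by correlation to the average non-progression signature, with cutoffs at 0.1 and -0.1, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: Pearson correlation is assumed as the correlation measure, samples falling between the two cutoffs are assumed to be left unclassified, and all function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


def mean_profile(profiles):
    """Gene-wise average of a list of training profiles."""
    return [sum(column) / len(profiles) for column in zip(*profiles)]


def call_sample(sample, non_progression_mean, cutoff=0.1):
    """Return (label, r): 'N' above +cutoff, 'P' below -cutoff.

    Leaving intermediate samples 'undetermined' is an assumption.
    """
    r = pearson(sample, non_progression_mean)
    if r > cutoff:
        return "N", r
    if r < -cutoff:
        return "P", r
    return "undetermined", r
```

Here `mean_profile` would be applied to the non-progressing training samples only, and each test sample is then passed to `call_sample` together with that average profile.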

Figure 5 Hierarchical cluster analysis of the metachronous tumor samples. Tightly clustering
tumors of different stage from the same patients are colored in grey.

Figure 6 Two-way hierarchical clustering and multidimensional scaling analysis of gene
expression data from 40 bladder tumour biopsies. a, Tumour cluster dendrogram based on
the 1767 gene-set. CIS annotations following the sample names indicate concomitant
carcinoma in situ. Tumour recurrence rates are shown to the right of the dendrogram as +
and ++ indicating moderate and high recurrence rates, respectively, while no sign indicates
no or moderate recurrence. b, Tumour cluster dendrogram based on 88 cancer related
genes. c, 2D plot of multidimensional scaling analysis of the 40 tumours based on the 1767
gene-set. The colour code identifies the tumour samples from the cluster dendrogram (Fig.
1a). d, Two-way cluster analysis diagram of the 1767 gene-set. Each row in the diagram
represents a gene and each column a tumour sample. The colour saturation represents
differences in gene expression across the tumour samples; light colour indicates higher
expression of the gene compared to the median expression and lower expression of the
gene compared to the median expression is shown in dark colour. The colour intensities indicate
degrees of gene-regulation. The sidebars to the right of the diagram represent gene clusters
a-j. Normal 1-3 on the left side indicate the three normal biopsies and normal 4 indicates
the pool of biopsies from 37 patients.



Figure 7 Enlarged view of the gene clusters a, c, f, and g. The dendrogram at the top is identical to Fig. 6a. a, Cluster of transcription factors and other nuclear associated genes. c, Cluster of genes involved in proliferation and cell cycle control. f, Gene expression pattern and corresponding area with squamous metaplasia in urothelial carcinoma. The light colour indicates genes up-regulated in samples 1178-1 and 875-1, the only two samples with squamous cell metaplasia. g, Cluster of genes involved in angiogenesis and matrix remodelling.

Figure 8. Hierarchical cluster analysis results

Here we show expanded views of clusters a-j as identified in the 1767 gene-cluster. The tumour cluster dendrogram and colour bars on top of the clusters represent the same tumour cluster as shown in the paper. The four samples to the left are normal biopsies (normal 1-3) and a pool of 37 normal biopsies (normal 4).



Figure 8a. Molecular classification of tumour samples using 80 predictive genes in each 
cross-validation loop. Each classification is based on the closeness to the mean in the three 
classes. Samples marked with * were not used to build the classifier. The scale indicates the 
distance from the samples to the classes in the classifier, measured in weighted squared 
Euclidean distance. 
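The classification rule in this legend (closeness to the mean of each of the three classes, measured as a weighted squared Euclidean distance) can be sketched as follows. This is only an illustration under stated assumptions: the legend does not say how the gene weights are derived, so `gene_weights` is a hypothetical input, and the two-gene training profiles are invented toy data rather than values from the study.

```python
def class_means(profiles_by_class):
    """Compute the per-gene mean expression profile of each class."""
    means = {}
    for label, profiles in profiles_by_class.items():
        n = len(profiles)
        means[label] = [sum(column) / n for column in zip(*profiles)]
    return means


def weighted_squared_distance(sample, mean, weights):
    """Weighted squared Euclidean distance between a sample and a class mean."""
    return sum(w * (x - m) ** 2 for x, m, w in zip(sample, mean, weights))


def classify(sample, means, weights):
    """Assign the sample to the class whose mean is closest."""
    distances = {
        label: weighted_squared_distance(sample, mean, weights)
        for label, mean in means.items()
    }
    return min(distances, key=distances.get), distances


# Hypothetical training profiles (two genes) for three stage classes.
training = {
    "Ta": [[1.0, 5.0], [1.2, 4.8]],
    "T1": [[3.0, 3.0], [3.2, 2.8]],
    "T2": [[5.0, 1.0], [4.8, 1.2]],
}
gene_weights = [1.0, 0.5]  # assumed per-gene weights, not taken from the source

means = class_means(training)
label, distances = classify([1.3, 4.5], means, gene_weights)
print(label, distances)
```

Returning all three distances, not only the winning label, mirrors the figure, which plots the distance from each sample to every class.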



Figure 9 Number of classification errors vs. number of genes used in cross-validation loops. 
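A curve of this kind can be produced with a leave-one-out loop in which the genes are re-selected inside every loop. The sketch below is an assumption-laden illustration, not the procedure used in the study: it ranks genes by the spread of the class means (a stand-in for whatever ranking was actually used), classifies with an unweighted nearest-mean rule, and runs on invented toy data.

```python
def centroid(profiles):
    """Per-gene mean of a list of expression profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def nearest_class(sample, centroids, genes):
    """Label of the closest centroid, using only the selected gene indices."""
    def distance(label):
        c = centroids[label]
        return sum((sample[g] - c[g]) ** 2 for g in genes)
    return min(centroids, key=distance)


def loo_errors(samples, labels, n_genes):
    """Count leave-one-out errors when n_genes are selected inside each loop."""
    errors = 0
    for held_out in range(len(samples)):
        by_class = {}
        for j, (profile, label) in enumerate(zip(samples, labels)):
            if j != held_out:
                by_class.setdefault(label, []).append(profile)
        centroids = {label: centroid(p) for label, p in by_class.items()}

        # Assumed ranking: genes whose class means differ most come first.
        def spread(g):
            values = [c[g] for c in centroids.values()]
            return max(values) - min(values)

        ranked = sorted(range(len(samples[0])), key=spread, reverse=True)
        selected = ranked[:n_genes]
        if nearest_class(samples[held_out], centroids, selected) != labels[held_out]:
            errors += 1
    return errors


# Hypothetical data: gene 0 separates the classes, gene 2 is noise.
samples = [
    [1.0, 2.0, 9.0], [1.2, 2.1, 1.0], [0.9, 1.9, 5.0],
    [5.0, 2.0, 1.5], [5.2, 2.2, 8.0], [4.9, 1.8, 4.0],
]
labels = ["A", "A", "A", "B", "B", "B"]

for n in (1, 2, 3):
    print(n, loo_errors(samples, labels, n))
```

Selecting the genes inside the loop keeps the held-out sample from influencing which genes are chosen, which is why the error count is reported per number of genes used in the loops.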

Figure 10 Expression profiles of the 71 genes used in the final classifier model. The tumors shown are the 33 tumors used in the cross validation scheme. The Ta tumors are shown to the left, the T1 tumors in the middle, and the T2 tumors to the right.

Figure 11 Number of prediction errors vs. number of genes used in cross-validation loops.

Figure 12 The expression profiles of the 26 genes that constitute our final prediction model. The genes are listed according to the degree of correlation with the recurrence and non-recurrence groups. Genes with highest correlations are found in the top and the bottom of the list.



Figure 13 Hierarchical cluster analysis of the gene expression in 41 TCC, 9 normal samples and 10 samples from cystectomy specimens with CIS lesions. a, Cluster dendrogram of all 41 TCC biopsies based on the expression of 5,491 genes. b, Cluster dendrogram of all superficial TCC biopsies based on the expression of 5,252 genes. c, Two-way cluster analysis diagram of the 41 TCC biopsies together with gene expressions in the normal and cystectomy samples (left columns). Each row represents a gene and each column represents a biopsy sample. Yellow indicates up-regulation compared to the median expression (black) of the gene and blue indicates down-regulation compared to the median expression. The colour saturation indicates degree of gene regulation. The sidebars to the right of the diagram represent gene-clusters 1-4; enlarged views of cluster 1 and 4 are shown to the right, with all gene symbols listed.

Figure 14 Delineation of the 100 best markers that separate TCC without CIS from TCC with concomitant CIS. a, The 50 best up-regulated marker genes in TCC without CIS are shown in the top and the 50 best up-regulated marker genes in TCC with CIS are shown in the bottom. The gene symbols are listed to the right of the diagram. b, Expression profiles of the 100 marker genes in 9 normal biopsies (left column), 5 histologically normal samples adjacent to CIS lesions (middle column), and 5 biopsies with CIS lesions detected (right column).



Figure 15 Cross validation performance using all samples

Figure 16 Expression profiles of the 16 genes in the CIS classifier. a, The expression of the 16 classifier genes in TCC with no surrounding CIS (left) and in TCC with surrounding CIS (right). The gene symbols of the classifier genes are listed together with the number of times used in cross-validation loops. b, The expression of the 16 classifier genes in normal samples, in histologically normal samples adjacent to CIS lesions, and in biopsies with CIS lesions. The top dendrogram shows the sample clustering from hierarchical cluster analysis based on the 16 classifier genes. The genes appear in the same order as in 3a.

Figure 17 Cross validation performance using half of the samples

Figure 18 shows table B
Figure 19 shows table C
Figure 20 shows table D
Figure 21 shows table E
Figure 22 shows table F
Figure 23 shows table G
Figure 24 shows table H



Detailed description of the invention



As discussed above, the present invention relates to the finding that it is possible to predict the prognosis of a biological condition by determining the expression level of one or more genes from a specified group of genes and comparing the expression level to at least one standard for expression levels. The present inventors have identified 562 genes relevant for predicting the prognosis of a biological condition, in particular a cancer disease, such as bladder cancer.

The following table A shows the genes relevant in this context. Whenever a gene is cited 
herein with reference to a gene No. the numbering refers to the genes of Table A. 



Table A

Gene No. | GeneChip | Probeset | Unigene Build | Unigene | Description | Classifier
1 | HUGeneFL | AB000220_at | 168 | Hs.171921 | sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3C | stage
2 | HUGeneFL | AF000231_at | 168 | Hs.75618 | RAB11A, member RAS oncogene family | stage
3 | HUGeneFL | D10922_s_at | 168 | Hs.99855 | formyl peptide receptor-like 1 | stage
4 | HUGeneFL | D10925_at | 168 | Hs.301921 | chemokine (C-C motif) receptor 1 | stage
5 | HUGeneFL | D11086_at | 168 | Hs.84 | interleukin 2 receptor, gamma (severe combined immunodeficiency) | stage
6 | HUGeneFL | D11151_at | 168 | Hs.211202 | endothelin receptor type A | stage
7 | HUGeneFL | D13435_at | 168 | Hs.426142 | phosphatidylinositol glycan, class F | stage
8 | HUGeneFL | D13666_s_at | 168 | Hs.136348 | osteoblast specific factor 2 (fasciclin I-like) | stage
9 | HUGeneFL | D14520_at | 168 | Hs.84728 | Kruppel-like factor 5 (intestinal) | stage
10 | HUGeneFL | D21878_at | 168 | Hs.169998 | bone marrow stromal cell antigen 1 | stage
11 | HUGeneFL | D26443_at | 168 | Hs.371369 | solute carrier family 1 (glial high affinity glutamate transporter), member 3 | stage
12 | HUGeneFL | D42046_at | 168 | Hs.194665 | DNA2 DNA replication helicase 2-like (yeast) | stage
13 | HUGeneFL | D45370_at | 168 | Hs.74120 | adipose specific 2 | stage
14 | HUGeneFL | D49372_s_at | 168 | Hs.54460 | chemokine (C-C motif) ligand 11 | stage
15 | HUGeneFL | D50495_at | 168 | Hs.224397 | transcription elongation factor A (SII), 2 | stage
16 | HUGeneFL | D63135_at | 168 | Hs.27935 | tweety homolog 2 (Drosophila) | stage
17 | HUGeneFL | D64053_at | 168 | Hs.198288 | protein tyrosine phosphatase, receptor type, R | stage
18 | HUGeneFL | D83920_at | 168 | Hs.440898 | ficolin (collagen/fibrinogen domain containing) 1 | stage
19 | HUGeneFL | D85131_s_at | 168 | Hs.433881 | MYC-associated zinc finger protein (purine-binding transcription factor) | stage
20 | HUGeneFL | D86062_s_at | 168 | Hs.413482 | chromosome 21 open reading frame 33 | stage
21 | HUGeneFL | D86479_at | 168 | Hs.439463 | AE binding protein 1 | stage
22 | HUGeneFL | D86957_at | 168 | | likely ortholog of mouse septin 8 | stage
23 | HUGeneFL | D86959_at | 168 | Hs.105751 | Ste20-related serine/threonine kinase | stage
24 | HUGeneFL | D86976_at | 168 | Hs.196914 | minor histocompatibility antigen HA-1 | stage
25 | HUGeneFL | D87433_at | 168 | Hs.301989 | stabilin 1 | stage
26 | HUGeneFL | D87443_at | 168 | Hs.409862 | sorting nexin 19 | stage
27 | HUGeneFL | D87682_at | 168 | Hs.134792 | KIAA0241 protein | stage
28 | HUGeneFL | D89077_at | 168 | Hs.75367 | Src-like-adaptor | stage
29 | HUGeneFL | D89377_at | 168 | Hs.89404 | msh homeo box homolog 2 (Drosophila) | stage
30 | HUGeneFL | D90279_s_at | 168 | Hs.433695 | collagen, type V, alpha 1 | stage
31 | HUGeneFL | HG1996-HT2044_at | 168 | | | stage
32 | HUGeneFL | HG2090-HT2152_s_at | 168 | | | stage
33 | HUGeneFL | HG2463-HT2559_at | 168 | | | stage
34 | HUGeneFL | HG3044-HT3742_s_at | 168 | | | stage
35 | HUGeneFL | HG3187-HT3366_s_at | 168 | | | stage
36 | HUGeneFL | HG3342-HT3519_s_at | 168 | | | stage
37 | HUGeneFL | HG371-HT26388_s_at | 168 | | | stage
38 | HUGeneFL | HG4069-HT4339_s_at | 168 | | | stage
39 | HUGeneFL | HG67-HT67_f_at | 168 | | | stage
40 | HUGeneFL | HG907-HT907_at | 168 | | | stage
Table A (continued)

Gene No. | GeneChip | Probeset | Unigene Build | Unigene | Description | Classifier
41 | HUGeneFL | J02871_s_at | 168 | Hs.436317 | cytochrome P450, family 4, subfamily B, polypeptide 1 | stage
42 | HUGeneFL | J03040_at | 168 | Hs.111779 | secreted protein, acidic, cysteine-rich (osteonectin) | stage
43 | HUGeneFL | imftfifl at | 168 | | — | stage
44 | HUGeneFL | J03068_at | 168 | — | — | stage
45 | HUGeneFL | J03241_s_at | 168 | Hs.2025 | transforming growth factor, beta 3 | stage
46 | HUGeneFL | J03278_at | 168 | Hs.307783 | platelet-derived growth factor receptor, beta polypeptide | stage
47 | HUGeneFL | J03909_at | 168 | | — | stage
48 | HUGeneFL | J03925_at | 168 | Hs.172631 | integrin, alpha M (complement component receptor 3, alpha; also known as CD11b (p170), macrophage antigen alpha polypeptide) | stage
49 | HUGeneFL | J04056_at | 168 | | carbonyl reductase 1 | stage
50 | HUGeneFL | J04058_at | 168 | Hs.169919 | electron-transfer-flavoprotein, alpha polypeptide (glutaric aciduria II) | stage
51 | HUGeneFL | J04130_s_at | 168 | Hs.75703 | chemokine (C-C motif) ligand 4 | stage
52 | HUGeneFL | J04152_rna1_s_at | 168 | — | | stage
53 | HUGeneFL | J04162_at | 168 | Hs.372679 | Fc fragment of IgG, low affinity IIIa, receptor for (CD16) | stage
54 | HUGeneFL | J04456_at | 168 | Hs.407909 | lectin, galactoside-binding, soluble, 1 (galectin 1) | stage
55 | HUGeneFL | J05032_at | 168 | Hs.32393 | aspartyl-tRNA synthetase | stage
56 | HUGeneFL | J05070_at | 168 | Hs.151738 | matrix metalloproteinase 9 (gelatinase B, 92kDa gelatinase, 92kDa type IV collagenase) | stage
57 | HUGeneFL | J05448_at | 168 | Hs.79402 | polymerase (RNA) II (DNA directed) polypeptide C, 33kDa | stage
58 | HUGeneFL | K01396_at | 168 | Hs.297681 | serine (or cysteine) proteinase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 1 | stage
59 | HUGeneFL | K03430_at | 168 | | — | stage
60 | HUGeneFL | L06797_s_at | 168 | Hs.421986 | chemokine (C-X-C motif) receptor 4 | stage
61 | HUGeneFL | L10343_at | 168 | Hs.112341 | protease inhibitor 3, skin-derived (SKALP) | stage
62 | HUGeneFL | L13391_at | 168 | Hs.78944 | regulator of G-protein signalling 2, 24kDa | stage
63 | HUGeneFL | L13698_at | 168 | Hs.65029 | growth arrest-specific 1 | stage
64 | HUGeneFL | L13720_at | 168 | Hs.437710 | growth arrest-specific 6 | stage
65 | HUGeneFL | L13923_at | 168 | Hs.750 | fibrillin 1 (Marfan syndrome) | stage
66 | HUGeneFL | L15409_at | 168 | Hs.421597 | von Hippel-Lindau syndrome | stage
67 | HUGeneFL | L17325_at | 168 | Hs.195825 | RNA binding protein with multiple splicing | stage
68 | HUGeneFL | L19872_at | 168 | Hs.170087 | aryl hydrocarbon receptor | stage
69 | HUGeneFL | L27476_at | 168 | Hs.75608 | tight junction protein 2 (zona occludens 2) | stage
70 | HUGeneFL | L33799_at | 168 | Hs.202097 | procollagen C-endopeptidase enhancer | stage
71 | HUGeneFL | L40388_at | 168 | Hs.30212 | thyroid receptor interacting protein 15 | stage
72 | HUGeneFL | L40904_at | 168 | Hs.387667 | peroxisome proliferative activated receptor, gamma | stage
73 | HUGeneFL | L41919_rna1_at | 168 | | | stage
74 | HUGeneFL | M11433_at | 168 | Hs.101850 | retinol binding protein 1, cellular | stage
75 | HUGeneFL | M11718_at | 168 | Hs.283393 | collagen, type V, alpha 2 | stage
76 | HUGeneFL | M12125_at | 168 | Hs.300772 | tropomyosin 2 (beta) | stage
77 | HUGeneFL | M14218_at | 168 | Hs.442047 | argininosuccinate lyase | stage
78 | HUGeneFL | M15395_at | 168 | Hs.375957 | integrin, beta 2 (antigen CD18 (p95), lymphocyte function-associated antigen 1; macrophage antigen 1 (mac-1) beta subunit) | stage
79 | HUGeneFL | M16591_s_at | 168 | Hs.89555 | hemopoietic cell kinase | stage
80 | HUGeneFL | M172?9_at | 168 | Hs.203862 | guanine nucleotide binding protein (G protein), alpha inhibiting activity polypeptide 1 | stage
81 | HUGeneFL | M20530_at | 168 | | — | stage
82 | HUGeneFL | M23178_s_at | 168 | Hs.73817 | chemokine (C-C motif) ligand 3 | stage
83 | HUGeneFL | M28130_rna1_s_at | 168 | | | stage
84 | HUGeneFL | M29550_at | 168 | Hs.187543 | protein phosphatase 3 (formerly 2B), catalytic subunit, beta isoform (calcineurin A beta) | stage
85 | HUGeneFL | M31165_at | 168 | Hs.407546 | tumor necrosis factor, alpha-induced protein 6 | stage
86 | HUGeneFL | M32011_at | 168 | Hs.949 | neutrophil cytosolic factor 2 (65kDa, chronic granulomatous disease, autosomal 2) | stage
87 | HUGeneFL | M33195_at | 168 | Hs.433300 | Fc fragment of IgE, high affinity I, receptor for; gamma polypeptide | stage
88 | HUGeneFL | M37033_at | 168 | Hs.443057 | CD53 antigen | stage
89 | HUGeneFL | M37766_at | 168 | Hs.901 | CD48 antigen (B-cell membrane protein) | stage
Table A (continued)

Gene No. | GeneChip | Probeset | Unigene Build | Unigene | Description | Classifier
90 | HUGeneFL | M55998_s_at | 168 | Hs.172928 | collagen, type I, alpha 1 | stage
91 | HUGeneFL | M57731_s_at | 168 | Hs.75765 | chemokine (C-X-C motif) ligand 2 | stage
92 | HUGeneFL | M62840_at | 168 | Hs.82542 | acyloxyacyl hydrolase (neutrophil) | stage
93 | HUGeneFL | M63262_at | 168 | | — | stage
94 | HUGeneFL | M68840_at | 168 | Hs.183109 | monoamine oxidase A | stage
95 | HUGeneFL | M69203_s_at | 168 | Hs.75703 | chemokine (C-C motif) ligand 4 | stage
96 | HUGeneFL | M72885_rna1_s_at | 168 | | | stage
97 | HUGeneFL | M77349_at | 168 | Hs.421496 | transforming growth factor, beta-induced, 68kDa | stage
98 | HUGeneFL | M82882_at | 168 | Hs.124030 | E74-like factor 1 (ets domain transcription factor) | stage
99 | HUGeneFL | M83822_at | 168 | | LPS-responsive vesicle trafficking, beach and anchor containing | stage
100 | HUGeneFL | M92934_at | 168 | Hs.410037 | connective tissue growth factor | stage
101 | HUGeneFL | M95178_at | 168 | Hs.119000 | actinin, alpha 1 | stage
102 | HUGeneFL | S69115_at | 168 | Hs.10306 | natural killer cell group 7 sequence | stage
103 | HUGeneFL | S77393_at | 168 | Hs.145754 | Kruppel-like factor 3 (basic) | stage
104 | HUGeneFL | S78187_at | 168 | Hs.153752 | cell division cycle 25B | stage
105 | HUGeneFL | U01833_at | 168 | Mc ft A A fiQ | nucleotide binding protein 1 (MinD homolog, E. coli) | stage
106 | HUGeneFL | U07231_at | 168 | Hs.309763 | G-rich RNA sequence binding factor 1 | stage
107 | HUGeneFL | U09278_at | 168 | Hs.436852 | fibroblast activation protein, alpha | stage
108 | HUGeneFL | U09937_rna1_s_at | 168 | — | | stage
109 | HUGeneFL | U10550_at | 168 | Hs.79022 | GTP binding protein overexpressed in skeletal muscle | stage
110 | HUGeneFL | U12424_s_at | 168 | Hs.108646 | glycerol-3-phosphate dehydrogenase 2 (mitochondrial) | stage
111 | HUGeneFL | U16306_at | 168 | Hs.434488 | chondroitin sulfate proteoglycan 2 (versican) | stage
112 | HUGeneFL | U20158_at | 168 | Hs.2488 | lymphocyte cytosolic protein 2 (SH2 domain containing leukocyte protein of 76kDa) | stage
113 | HUGeneFL | U20536_s_at | 168 | Hs.3280 | caspase 6, apoptosis-related cysteine protease | stage
114 | HUGeneFL | U24266_at | 168 | Hs.77448 | aldehyde dehydrogenase 4 family, member A1 | stage
115 | HUGeneFL | U28249_at | 168 | Hs.301350 | FXYD domain containing ion transport regulator 3 | stage
116 | HUGeneFL | U28488_s_at | 168 | Hs.155935 | complement component 3a receptor 1 | stage
117 | HUGeneFL | U29680_at | 168 | Hs.227817 | BCL2-related protein A1 | stage
118 | HUGeneFL | U37143_at | 168 | Hs.152096 | cytochrome P450, family 2, subfamily J, polypeptide 2 | stage
119 | HUGeneFL | U38864_at | 168 | Hs.108139 | zinc finger protein 212 | stage
120 | HUGeneFL | U39840_at | 168 | Hs.163484 | forkhead box A1 | stage
121 | HUGeneFL | U41315_rna1_s_at | 168 | | | stage
122 | HUGeneFL | U44111_at | 168 | Hs.42151 | histamine N-methyl transferase | stage
123 | HUGeneFL | U47414_at | 168 | Hs.13291 | cyclin G2 | stage
124 | HUGeneFL | U49352_at | 168 | Hs.414754 | 2,4-dienoyl CoA reductase 1, mitochondrial | stage
125 | HUGeneFL | U50708_at | 168 | Hs.1265 | branched chain keto acid dehydrogenase E1, beta polypeptide (maple syrup urine disease) | stage
126 | HUGeneFL | U52101_at | 168 | Hs.9999 | epithelial membrane protein 3 | stage
127 | HUGeneFL | U59914_at | 168 | Hs.153863 | MAD, mothers against decapentaplegic homolog 6 (Drosophila) | stage
128 | HUGeneFL | U60205_at | 168 | Hs.393239 | sterol-C4-methyl oxidase-like | stage
129 | HUGeneFL | U61981_at | 168 | Hs.42674 | mutS homolog 3 (E. coli) | stage
130 | HUGeneFL | U64520_at | 168 | Hs.66708 | vesicle-associated membrane protein 3 (cellubrevin) | stage
131 | HUGeneFL | U65093_at | 168 | Hs.82071 | Cbp/p300-interacting transactivator, with Glu/Asp-rich carboxy-terminal domain, 2 | stage
132 | HUGeneFL | U66619_at | 168 | Hs.444445 | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily d, member 3 | stage
133 | HUGeneFL | U68019_at | 168 | Hs.288261 | MAD, mothers against decapentaplegic homolog 3 (Drosophila) | stage
134 | HUGeneFL | U68385_at | 168 | Hs.380923 | likely ortholog of mouse myeloid ecotropic viral integration site-related gene 2 | stage
135 | HUGeneFL | U68485_at | 168 | Hs.193163 | bridging integrator 1 | stage
136 | HUGeneFL | U74324_at | 168 | Hs.90875 | RAB interacting factor | stage
137 | HUGeneFL | U77970_at | 168 | Hs.321164 | neuronal PAS domain protein 2 | stage
138 | HUGeneFL | U83303_cds2_at | 168 | Hs.164021 | chemokine (C-X-C motif) ligand 6 (granulocyte chemotactic protein 2) | stage
139 | HUGeneFL | U88871_at | 168 | Hs.79993 | peroxisomal biogenesis factor 7 | stage
metallothionein 2A
metallothionein 2A
fibronectin 1
granulomatous disease)
acetyl-Coenzyme A acyltransferase 1 (peroxisomal 3-oxoacyl-Coenzyme A thiolase)
collagen, type VI, alpha 1
collagen, type VI, alpha 2
chemokine (C-X-C motif) ligand 3



interferon induced transmembrane protein 2 (1-8D)
GATA binding protein 3
WEE1 homolog (S. pombe)
integrin, beta 2 (antigen CD18 (p95), lymphocyte function-associated antigen 1; macrophage antigen 1 (mac-1) beta subunit)
S100 calcium binding protein P
fibroblast growth factor receptor 1 (fms-related tyrosine kinase 2, Pfeiffer syndrome)
glutamate dehydrogenase 1
synaptophysin-like protein
microtubule-associated protein 7
chloride channel 3
PTK6 protein tyrosine kinase 6
reticulocalbin 2, EF-hand calcium binding
collagen, type IV, alpha 6
homeo box C6
protease, serine, 11 (IGF binding)
milk fat globule-EGF factor 8 protein
skeletal muscle and kidney enriched inositol phosphatase
cysteine-rich, angiogenic inducer, 61
eukaryotic translation initiation factor 3, subunit 5 epsilon, 47kDa
laminin, alpha 3
acidic (leucine-rich) nuclear phosphoprotein 32 family, member B
protein tyrosine phosphatase, receptor type, N polypeptide 2
expressed sequence
autocrine motility factor receptor
leukotriene B4 12-hydroxydehydrogenase
RNA binding motif protein, X chromosome
general transcription factor IIE, polypeptide 2, beta 34kDa
5,10-methenyltetrahydrofolate synthetase (5-formyltetrahydrofolate cyclo-ligase)
TAF9 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 32kDa
protein tyrosine phosphatase, non-receptor type 3
S100 calcium binding protein A12 (calgranulin C)
small proline-rich protein 2C
S100 calcium binding protein A7 (psoriasin 1)
keratin 16 (focal non-epidermolytic palmoplantar keratoderma)
interleukin 13 receptor, alpha 2
matrix metalloproteinase 11 (stromelysin 3)
NM_003105*:Homo sapiens sortilin-related receptor, L(DLR class) A repeats-containing (SORL1), mRNA.
NM_003105*:Homo sapiens sortilin-related receptor, L(DLR class) A repeats-containing (SORL1), mRNA.
NM_003105*:Homo sapiens sortilin-related receptor, L(DLR class) A repeats-containing (SORL1), mRNA.
sortilin-related receptor, L(DLR class) A repeats-containing (SORL1)

Target Exon
NM_007181*:Homo sapiens mitogen-activated protein kinase kinase kinase kinase 1 (MAP4K1), mRNA.
C6001282:gi|4504223|ref|NP_000172.1| glucuronidase, beta [Homo sapiens] gi|114963|sp|P082
Target Exon
Target Exon
NM_022819*:Homo sapiens phospholipase A2, group IIF (PLA2G2F), mRNA. VERSION NM_020245.2 GI
NM_024408*:Homo sapiens Notch (Drosophila) homolog 2 (NOTCH2), mRNA. VERSION NM_024410.1 GI
Insulin-like growth factor 2 (somatomedin A) (IGF2)
NM_021628*:Homo sapiens arachidonate lipoxygenase 3 (ALOXE3), mRNA. VERSION NM_020229.1 GI
NM_005569*:Homo sapiens LIM domain kinase 2 (LIMK2), transcript variant 2a, mRNA.
Target Exon
Target Exon
ESTs
desmoplakin (DPI, DPII)
gb:z!73d06.r1 Stratagene colon (937204) Homo sapiens cDNA clone 5', mRNA sequence
methionine adenosyltransferase II, beta
phosphorylase kinase, alpha 2 (liver)
DKFZP586H2123 protein
serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 5
zinc finger protein 36 (KOX 18)
mitogen-activated protein kinase kinase 2
integrin, alpha 7
hypothetical protein MGC11352
sion

gb:601146990F1 NIH_MGC_19 Homo sapiens cDNA clone 5', mRNA sequence    progression
ESTs    progression
RNA binding motif protein, X chromosome    progression
collagen, type IV, alpha 2    progression
hypothetical protein FLJ22479    progression
minichromosome maintenance deficient (S. cerevisiae) 7    progression
KIAA0068 protein    progression
hairy/enhancer-of-split related with YRPW motif-like    progression
heterogeneous nuclear ribonucleoprotein A0    progression
Homo sapiens cDNA FLJ13571 fis, clone PLACE1008405    progression
polo (Drosophila)-like kinase    progression
hypothetical protein FLJ13459    progression
SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 4    progression
neuron-specific protein    progression
UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 1 (GalNAc-T1)    progression
hypothetical protein FKSG44    progression
hypoxanthine phosphoribosyltransferase 1 (Lesch-Nyhan syndrome)    progression
fragile X mental retardation, autosomal homolog 1    progression
CDC20 (cell division cycle 20, S. cerevisiae, homolog)    progression
cyclin D1 (PRAD1: parathyroid adenomatosis 1)    progression
membrane cofactor protein (CD46, trophoblast-lymphocyte cross-reactive antigen)    progression
KIAA0143 protein    progression
Homo sapiens mRNA; cDNA DKFZp564D1462 (from clone DKFZp564D1462)    progression

growth factor receptor-bound protein 7 progres- 
133 Hs.66219
133 Hs.367762
133 Hs.101067
133 Hs.283609
133 Hs.285641
133 Hs.301685
133 Hs.106210
133 Hs.108258
133 Hs.110457
133 Hs.110953
133 Hs.104520
133 Hs.300741
sion

hypothetical protein    progression
Homo sapiens, clone IMAGE:3355383, mRNA, partial cds    progression
PTD011 protein    progression
FH1/FH2 domain-containing protein    progression
ESTs    progression
ESTs, Weakly similar to A47582 B-cell growth factor precursor [H.sapiens]    progression
ESTs    progression
GCN5 (general control of amino-acid synthesis, yeast, homolog)-like 2    progression
KIAA0807 protein    progression
major histocompatibility complex, class I-like sequence    progression
ESTs, Moderately similar to T12512 hypothetical protein DKFZp434G232.1 [H.sapiens]    progression
hypothetical protein PRO2032    progression
HIV-1 inducer of short transcripts binding protein; lymphoma related factor    progression
KIAA1111 protein    progression
KIAA0620 protein    progression
hypothetical protein FLJ10813    progression
peroxisome proliferative activated receptor, delta    progression
fibroblast growth factor receptor 3 (achondroplasia, thanatophoric dwarfism)    progression
actin binding protein; macrophin (microfilament and actin filament cross-linker protein)    progression
Wolf-Hirschhorn syndrome candidate 1    progression
retinoic acid induced 1    progression
Homo sapiens cDNA FLJ13694 fis, clone PLACE2000115    progression
sorcin    progression
CGI-18 protein    progression
425380
426028
426468
426469

133 Hs.349256
133 Hs.380062
133 Hs.31731
133 Hs.132955
133 Hs.136309
133 Hs.143601
133 Hs.146580
133 Hs.153752
133 Hs.153937
133 Hs.154525
133 Hs.154545
133 Hs.155106
133 Hs.155188
133 Hs.155291
133 Hs.32148
133 Hs.172028
133 Hs.166994
133 Hs.167700
133 Hs.28917
133 Hs.117558
133 Hs.363039
133 Hs.170171
133 Hs.2056


ESTs, Weakly similar to MGB4_HUMAN MELANOMA-ASSOCIATED ANTIGEN B4 [H.sapiens]    progression
paired immunoglobulin-like receptor beta    progression
gb:EST385571 MAGE resequences, MAGM Homo sapiens cDNA, mRNA sequence    progression
ornithine decarboxylase antizyme 1    progression
peroxiredoxin 5    progression
BCL2/adenovirus E1B 19kD-interacting protein 3-like    progression
SH3-containing protein SH3GLB1    progression
hypothetical protein hCLA-iso    progression
enolase 2, (gamma, neuronal)    progression
cell division cycle 25B    progression
activated p21cdc42Hs kinase    progression
KIAA1076 protein    progression
PDZ domain containing guanine nucleotide exchange factor(GEF)1    progression
receptor (calcitonin) activity modifying protein 2    progression
TATA box binding protein (TBP)-associated factor, RNA polymerase II, F, 55kD    progression
KIAA0005 gene product    progression
AD-015 protein    progression
a disintegrin and metalloproteinase domain 10 (ADAM10)    progression
FAT tumor suppressor (Drosophila) homolog    progression
Homo sapiens cDNA FLJ10174 fis, clone HEMBA1003959    progression
ESTs    progression
ESTs    progression
methylmalonate-semialdehyde dehydrogenase    progression
glutamate-ammonia ligase (glutamine synthase)    progression

UDP glycosyltransferase 1 family, polypeptide pro- 
428115
428712
428901
429124
429187
429311
429561
429802
429953
430604
430677
430746
431604
431842

133 Hs.303154
133 Hs.173091
133 Hs.356512
133 Hs.123253
133 Hs.284232
133 Hs.180479
133 Hs.180655
133 Hs.181369
133 Hs.300855
133 Hs.183435
133 Hs.356190
133 Hs.190452
133 Hs.146668
133 Hs.196914
133 Hs.163872
133 Hs.198998
133 Hs.250646
133 Hs.5367
133 Hs.226581
133 Hs.247309
133 Hs.359784
133 Hs.406256
133 Hs.264190
133 Hs.271473



A9
popeye protein 3
ubiquitin-like 3
ubiquitin carrier protein
hypothetical protein FLJ22009
tumor necrosis factor receptor superfamily, member 12 (translocating chain-association membrane protein)
hypothetical protein FLJ20116
serine/threonine kinase 12
ubiquitin fusion degradation 1-like
KIAA0977 protein
NM_004545:Homo sapiens NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 1 (7kD, MNLL) (NDUFB1), mRNA.
ubiquitin B
KIAA0365 gene product
KIAA1253 protein
minor histocompatibility antigen HA-1
ESTs, Weakly similar to S65657 alpha-1C-adrenergic receptor splice form 2 [H.sapiens]
conserved helix-loop-helix ubiquitous kinase
baculoviral IAP repeat-containing 6
ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]
COX15 (yeast) homolog, cytochrome c oxidase assembly protein
succinate-CoA ligase, GDP-forming, beta subunit
desmoglein 2
ESTs
vacuolar protein sorting 35 (yeast homolog)
epithelial protein up-regulated in carcinoma, membrane associated protein 17
431857
432258
432327
432554
432864
433052
433282
433844
433914
434055
434263
434547
434831
434978
435158
435320
435521
436472
436576
437223
437256
437524

133 Hs.271742
133 Hs.293039
133 Hs.274363
133 Hs.278411
133 Hs.359682
133 Hs.293003
133 Hs.49007
133 Hs.179647
133 Hs.112160
133 Hs.3726
133 Hs.79187
133 Hs.106124
133 Hs.273397
133 Hs.4310
133 Hs.65588
133 Hs.117864
133 Hs.6361
133 Hs.46366
133 Hs.77542
133 Hs.330716
133 Hs.97871
133 Hs.385719
133 Hs.15670
133 Hs.129037



ADP-ribosyltransferase (NAD; poly (ADP-ribose) polymerase)-like 3
ESTs
neuroglobin
NCK-associated protein 1
calpastatin
ESTs, Weakly similar to PC4259 ferritin associated protein [H.sapiens]
hypothetical protein
Homo sapiens cDNA FLJ12195 fis, clone MAMMA1000865
Homo sapiens DNA helicase homolog (PIF1) mRNA, partial cds
x 003 protein
ESTs
ESTs
KIAA0710 gene product
eukaryotic translation initiation factor 1A
DAZ associated protein 1
ESTs
mitogen-activated protein kinase kinase 1 interacting protein 1
KIAA0948 protein
ESTs
Homo sapiens cDNA FLJ14368 fis, clone HEMBA1001122
Homo sapiens, clone IMAGE:3845253, mRNA, partial cds
ESTs
ESTs
ESTs
443679
443893
444037
444312
444336
444604
445084
445462
445692

133 Hs.30738
133 Hs.6451
133 Hs.75216
133 Hs.375195
133 Hs.350547
133 Hs.334437
133 Hs.6856
133 Hs.158549
133 Hs.317714
133 Hs.20950
133 Hs.132545
133 Hs.8148
133 Hs.8375
133 Hs.348514
133 Hs.398102
133 Hs.9670
133 Hs.115472
133 Hs.380932
133 Hs.351142
133 Hs.10882
133 Hs.11441
133 Hs.250848
133 Hs.288649
133 Hs.182099



ESTs 



PRO0659 protein 



Homo sapiens cDNA FU13713 fis, clone 
PLACE2000398, moderately similar to LAR 
PROTEIN PRECURSOR (LEUKOCYTE 
ANTIGEN RELATED) (EC 3.1.3.48) 

ESTs 



progres- 
sion 



progres- 
sion 

progres- 
sion 



nuclear receptor co-repressor/HDAC3 com- 
plex subunit 

hypothetical protein MGC4248 



ash2 (absent, small, or homeotic, Drosophila, 

homo1og)-like 

ESTs, Weakly similar to T2D3 HUMAN 
TRANSCRIPTION INITIATION FACTOR 
TFIID 135 KDA SUBUNIT [H.sapiens] 
pallid (mouse) homolog, pailidin 



phospholysine phosphohistidine inorganic 
pyrophosphate phosphatase 

ESTs 



selenoprotein T 
TNF receptor-associated factor 4 



ESTs, Moderately similar to 2109260A B cell 
growth factor [Ksapiens] 

Homo sapiens clone FLB3442 PRO0872 
mRNA, complete cds 

hypothetical protein FLJ10948



ESTs, Weakly similar to 2004399A chromosomal protein [H.sapiens]

CHMP1.5 protein 
ESTs 

HMG-box containing protein 1 
chromosome 1 open reading frame 8 
hypothetical protein FLJ14761
hypothetical protein MGC3077 
ESTs
445831
446556 
446847 
447343 
447400 
448357 
448524 
448625 
448780 



448813 
449268 
449626 
450893 
450997 
451164 
451225 
451867 
451970 
452012 
452170 
452517 
452829 
452929
133
133 



133
Hs.13351
Hs.15303
Hs.82845 

Hs.236894 
Hs.18457

Hs.108923 
Hs.21356 

Hs.178470 

Hs.267749 



Hs.22142 
Hs.23412 
Hs.112860
Hs.25625 
Hs.35254 
Hs.60659 
Hs.57655 
Hs.27192 
Hs.211046 
Hs.279766 
Hs.28285 

Hs.63368 
Hs.172816
LanC (bacterial lantibiotic synthetase component C)-like 1 progression
KIAA0349 protein progression
Homo sapiens cDNA: FLJ21930 fis, clone HEP04301, highly similar to HSU90916 Human clone 23815 mRNA sequence progression
ESTs, Highly similar to S02392 alpha-2-macroglobulin receptor precursor [H.sapiens] progression
hypothetical protein FLJ20315 progression
RAB38, member RAS oncogene family progression
hypothetical protein DKFZp762K2015 progression
hypothetical protein FLJ22662 progression
Human DNA sequence from clone 366N23 on chromosome 6q27. Contains two genes similar to consecutive parts of the C. elegans UNC-93 (protein 1, C46F11.1) gene, a KIAA0173 and Tubulin-Tyrosine Ligase LIKE gene, a Mitotic Feedback Control Protein MADP2H progression
cytochrome b5 reductase b5R.2 progression
hypothetical protein FLJ20160 progression
zinc finger protein 258 progression
hypothetical protein FLJ11323 progression
hypothetical protein FLB6421 progression
ESTs, Weakly similar to T46471 hypothetical protein DKFZp434L0130.1 [H.sapiens] progression
ESTs progression
hypothetical protein dJ1057B20.2 progression
ESTs progression
kinesin family member 4A progression
patched related protein translocated in renal cancer progression
gb:RC-BT068-130399-068 BT068 Homo sapiens cDNA, mRNA sequence progression
ESTs, Weakly similar to TRHY_HUMAN TRICHOHYALIN [H.sapiens] progression
neuregulin 1 progression
441 EOS Hu03
443 EOS Hu03



444 EOS Hu03 



445 EOS Hu03 



446 EOS Hu03 



453395 



454639 



456332 



457228 



458132 



408668 



410691 



420269 



422119 



422765 



422984 



428016 



437325 



444773 



445926 



452714 



452866 



453963 



457329 



447 U133A 200600_at



448 



449 
450 



451 
452 

453 

454 



U133A 



200762_at



U133A 201088_at 
U133A 201291_s_at



U133A 201560_at

U133A 201616_s_at

U133A 201641_at 

U133A 201744_s_at
133 Hs.377915
133 

133 Hs.399939 

133 Hs.195471

133 Hs.103267

133 Hs.152925 

133 Hs.65450 

133 Hs.96264 

133 Hs.111862

133 Hs.1578 

133 Hs.351597 

133 Hs.181461 

133 Hs.5548 

133 Hs.11923 

133 Hs.334826 

133 Hs.30340 

133 Hs.268016 

133 Hs 

133 Hs.359682 

168 Hs.170328

168 Hs.173381



mannosidase, alpha, class 2A, member 1 



gb:RC2-ST0158-091099-011-d05 ST0158 
Homo sapiens cDNA, mRNA sequence 

gb:nc39d05.r1 NCI_CGAP_Pr2 Homo sapiens 
cDNA clone, mRNA sequence 

Human cosmid CRI-JC2015 at D10S289 in 10sp13

hypothetical protein FLJ22548 similar to gene trap PAT 12

KIAA1268 protein 



reticulon 4 



alpha thalassemia/mental retardation syndrome X-linked (RAD54 (S. cerevisiae) homolog)

KIAA0590 gene product 



baculoviral IAP repeat-containing 5 (survivin) 

ESTs 

ariadne homolog, ubiquitin-conjugating en- 
zyme E2 binding protein, 1 (Drosophila) 

F-box and leucine-rich repeat protein 5 
hypothetical protein DJ167A19.1 
splicing factor 3b, subunit 1, 155kDa 



KIAA1165: likely ortholog of mouse Nedd4 WW domain-binding protein 5A

ESTs 



26959 cDNA FLJ36513 fis, clone TRACH2001523



calpastatin 



NM_001910; cathepsin E isoform a preproprotein NM_148964; cathepsin E isoform b preproprotein
NM_019894; transmembrane protease, serine 4 isoform 1 NM_183247; transmembrane protease, serine 4 isoform 2
168 Hs.159557 NM_000228; laminin subunit beta 3 precursor
168 Hs.156346 NM_030570; uroplakin 3B isoform a NM_182683; uroplakin 3B isoform c NM_182684; uroplakin 3B isoform b
168 Hs.25035 NM_005547; involucrin
168 Hs.443811 NM_004692; NM_032727; internexin neuronal intermediate filament protein, alpha
168 Hs.118110 NM_016233; peptidylarginine deiminase type III
168 Hs.406475 NM_014417; BCL2 binding component 3
Hs.76224
Hs.1908
Hs.1908
Hs.17109
Hs.416073
Hs.155048
Hs.18141


168 


Hs.409034
Hs.76422
Hs.75268
Hs.371617
Hs.172740
Hs.391561
Hs.300701
Hs.76888
Hs.279916
Hs.409798
Hs.377028
Hs.85266
Hs.152096
Hs.155597
Hs.290432
Hs.2942
Hs.1355
Hs.95582


168 


Hs.47042 


168 


Hs.82547
Hs.83760


168 


Hs.277543 


168 


Hs.116724


168 


Hs.284211 



168 
168 
168 
168 
168 
168 
168 

168 
168 
168 

168 

168 



Hs.443435 
Hs.379613 
Hs.505407 
Hs.436983 

Hs.21293 
Hs.170195 

Hs.85201 

Hs.188401 
Hs.5338 
Hs.86859 

Hs.82237 



NM_020142; NADH:ubiquinone oxidoreductase MLRQ subunit homolog CIS
NM_018058; cartilage acidic protein 1 CIS
NM_000497; cytochrome P450, subfamily XIB (steroid 11-beta-hydroxylase), polypeptide 1 precursor CIS
NM_007193; annexin A10 CIS
NM_001958; eukaryotic translation elongation factor 1 alpha 2 CIS
NM_005581; Lutheran blood group (Auberger b antigen included) CIS
NM_005581; Lutheran blood group (Auberger b antigen included) CIS
NM_030570; uroplakin 3B isoform a NM_182683; uroplakin 3B isoform c NM_182684; uroplakin 3B isoform b CIS
NM_000300; phospholipase A2, group IIA (platelets, synovial fluid) CIS
NM_007193; annexin A10 CIS
NM_007144; ring finger protein 110 CIS
NM_014417; BCL2 binding component 3 CIS
NM_001442; fatty acid binding protein 4, adipocyte CIS
NM_017689; hypothetical protein FLJ20151 CIS
NM_007144; ring finger protein 110 CIS
NM_004692; NM_032727; internexin neuronal intermediate filament protein, alpha CIS
NM_001248; ectonucleoside triphosphate diphosphohydrolase 3 CIS
NM_017689; hypothetical protein FLJ20151 CIS
NM_001958; eukaryotic translation elongation factor 1 alpha 2 CIS
NM_016233; peptidylarginine deiminase type III CIS
NM_000445; plectin 1, intermediate filament binding protein 500kDa CIS
NM_000213; integrin, beta 4 CIS
NM_019894; transmembrane protease, serine 4 isoform 1 NM_183247; transmembrane protease, serine 4 isoform 2 CIS
NM_000213; integrin, beta 4 CIS
NM_002145; homeo box B2 CIS
NM_006760; uroplakin 2 CIS
NM_001910; cathepsin E isoform a preproprotein NM_148964; cathepsin E isoform b preproprotein CIS
NM_006942; SRY-box 15 CIS
NM_001248; ectonucleoside triphosphate diphosphohydrolase 3 CIS
NM_005522; homeobox A1 protein isoform a NM_153620; homeobox A1 protein isoform b CIS
NM_003282; troponin I, skeletal, fast CIS
NM_015162; lipidosin CIS
NM_015162; lipidosin CIS
NM_030570; uroplakin 3B isoform a NM_182683; uroplakin 3B isoform c NM_182684; uroplakin 3B isoform b CIS
NM_000213; integrin, beta 4 CIS
NM_006760; uroplakin 2 CIS
NM_015162; lipidosin CIS
NM_000228; laminin subunit beta 3 precursor CIS
NM_007144; ring finger protein 110 CIS
NM_000228; laminin subunit beta 3 precursor CIS
NM_001248; ectonucleoside triphosphate diphosphohydrolase 3 CIS
NM_007193; annexin A10 CIS
NM_017689; hypothetical protein FLJ20151 CIS
NM_020142; NADH:ubiquinone oxidoreductase MLRQ subunit homolog CIS
NM_001958; eukaryotic translation elongation factor 1 alpha 2 CIS
NM_000300; phospholipase A2, group IIA (platelets, synovial fluid) CIS



WO 2004/040014 PCT/DK2003/000750



501 



502 
503 
504 

505 
506 

507 

508 
509 
510 
511 



512 
513 

514 
515 
516 



517 

518 
519 

520 
521 
522 

523 
524 

525 
526 
527 
528 

529 
530 



531 
532 
533 

534 

535 

536 

537 

538 
539 

540 

541 

542 
543 

544 



U133A 
U133A 

U133A 

U133A 
U133A 



U133A 
U133A 



U133A 



U133A 
U133A 



U133A 
U133A 
U133A 



U133A 
U133A 

U133A 

U133A 

U133A 
U133A 



211430_s_at 


168 


Hs.413826 


211671_s_at
211692_s_at
211896_s_at 


168 
168 
168 


Hs.126608 
Hs.87246 
Hs.156316 


212077_at
212192_at 


168 
168 


Hs.443811 
Hs.109438


212195_at


168 


Hs.71968 


212386_at
212667_at
212671_s_at
21299o\jTat 


168 
168 
168 
168 


Hs.359289 
Hs.111779
Hs.387679 
Hs.375115 


213891_s_at
213975_s_at 


168 
168 


Hs.359289 
Hs.234734 


214352_s_at
214599_at
214630_at 


168 
168 
168 


Hs.412107 
Hs.157091 
Hs.184927 


214639_s_at 


168 


Hs.67397 


214651_s_at
214669_x_at 


168 
168 


Hs.127428
Hs.377975 


214677_x_at 
Z14/OZ x at 
215076_s_at


168 
168 
168 


Hs.449601 
Hs.195464
Hs.443625 


215121_x_at
215176_x_at 


168 
168 


Hs.356861 
Hs.503443 


215379_x_at
215812_s_at 
2ioo4i s at 
216971_s_at 


168 
168 
168 
168 


Hs.449601 
Hs.499113 
Hs.18141 
Hs.79706 


217028_at 
zi /U4U_x_at 


168 
168 


Hs.421986 
Hs.95582 


217388_s_at
217626_at
218484_at 


168 
168 
168 


Hs.444471 
Hs.201967 
Hs.221447 


Z 1 oooo__s_a t 


168 


Hs.93765 


218718_at


168 


Hs.43080 


218918_at 


168 


Hs.8910 


218960_at 


168 


Hs.414005 


219410_at 


168 


Hs.104800


219922_s_at 


168 


Hs.289019 


220026_at 


168 


Hs.227059 


220779_at 


168 


Hs.149195 


221204_s_at
221660_at 


168 
168 


Hs.326444 
Hs.247831 


221671_x_at


168 


Hs.377975 



NM_001910; cathepsin E isoform a preproprotein NM_148964; cathepsin E isoform b preproprotein CIS
NM_007144; ring finger protein 110 CIS
NM_014417; BCL2 binding component 3 CIS
NM_005581; Lutheran blood group (Auberger b antigen included) CIS
NM_003282; troponin I, skeletal, fast CIS
NM_020142; NADH:ubiquinone oxidoreductase MLRQ subunit homolog CIS
NM_000445; plectin 1, intermediate filament binding protein 500kDa CIS
NM_005547; involucrin CIS
NM_000299; plakophilin 1 CIS
NM_002145; homeo box B2 CIS
NM_000497; cytochrome P450, subfamily XIB (steroid 11-beta-hydroxylase), polypeptide 1 precursor CIS
NM_007193; annexin A10 CIS
NM_005522; homeobox A1 protein isoform a NM_153620; homeobox A1 protein isoform b CIS
NM_006760; uroplakin 2 CIS
NM_005547; involucrin CIS
NM_000497; cytochrome P450, subfamily XIB (steroid 11-beta-hydroxylase), polypeptide 1 precursor CIS
NM_005522; homeobox A1 protein isoform a NM_153620; homeobox A1 protein isoform b CIS
NM_002145; homeo box B2 CIS
NM_001442; fatty acid binding protein 4, adipocyte CIS
NM_006942; SRY-box 15 CIS
NM_006942; SRY-box 15 CIS
NM_016233; peptidylarginine deiminase type III CIS
NM_018058; cartilage acidic protein 1 CIS
NM_001248; ectonucleoside triphosphate diphosphohydrolase 3 CIS
NM_006760; uroplakin 2 CIS
NM_018058; cartilage acidic protein 1 CIS
NM_005547; involucrin CIS
NM_000445; plectin 1, intermediate filament binding protein 500kDa CIS
NM_003282; troponin I, skeletal, fast CIS
NM_001910; cathepsin E isoform a preproprotein NM_148964; cathepsin E isoform b preproprotein CIS
NM_000228; laminin subunit beta 3 precursor CIS
NM_000299; plakophilin 1 CIS
NM_020142; NADH:ubiquinone oxidoreductase MLRQ subunit homolog CIS
NM_001442; fatty acid binding protein 4, adipocyte CIS
NM_000445; plectin 1, intermediate filament binding protein 500kDa CIS
NM_000300; phospholipase A2, group IIA (platelets, synovial fluid) CIS
NM_019894; transmembrane protease, serine 4 isoform 1 NM_183247; transmembrane protease, serine 4 isoform 2 CIS
NM_004692; NM_032727; internexin neuronal intermediate filament protein, alpha CIS
NM_030570; uroplakin 3B isoform a NM_182683; uroplakin 3B isoform c NM_182684; uroplakin 3B isoform b CIS
NM_001442; fatty acid binding protein 4, adipocyte CIS
NM_016233; peptidylarginine deiminase type III CIS
NM_018058; cartilage acidic protein 1 CIS
NM_000300; phospholipase A2, group IIA (platelets, synovial fluid) CIS
NM_000299; plakophilin 1 CIS
NM_000299; plakophilin 1 CIS
NM_001958; eukaryotic translation elongation factor 1 alpha 2 CIS
NM_005625; syndecan binding protein (syntenin) CIS
NM_002719; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform a NM_178586; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform b NM_178587; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform c NM_178588; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform d CIS
NM_001560; interleukin 13 receptor, alpha 1 precursor CIS
NM_001166; baculoviral IAP repeat-containing protein 2 CIS
NM_007373; soc-2 suppressor of clear homolog CIS
NM_003563; speckle-type POZ protein CIS
NM_012161; F-box and leucine-rich repeat protein 5 isoform 1 NM_033535; F-box and leucine-rich repeat protein 5 isoform 2 CIS
NM_015716; misshapen/NIK-related kinase isoform 1 NM_153827; misshapen/NIK-related kinase isoform 3 NM_170663; misshapen/NIK-related kinase isoform 2 CIS
NM_003925; methyl-CpG binding domain protein 4 CIS
NM_012164; F-box and WD-40 domain protein 2 CIS
NM_015125; capicua homolog CIS
CIS
NM_015076; cyclin-dependent kinase (CDC2-like) 11 CIS
NM_018957; SH3-domain binding protein 1 CIS
NM_018695; erbb2 interacting protein CIS
NM_012097; ADP-ribosylation factor-like 5 isoform 1 NM_177985; ADP-ribosylation factor-like 5 isoform 2 CIS






The expression level of at least one gene in the sample is determined, wherein at least one 
of said genes is selected from the genes of Table A. The samples according to the present 
invention may be any tissue sample or body fluid sample; it is, however, often preferred to
conduct the methods according to the invention on epithelial tissue, such as epithelial tissue 
from the bladder. In particular the epithelial tissue may be mucosa. In another embodiment 
the sample is a urine sample comprising the tissue cells. 






The sample may be obtained by any suitable manner known to the man skilled in the art, 
such as a biopsy of the tissue, or a superficial sample scraped from the tissue. The sample 
may be prepared by forming a cell suspension made from the tissue, or by obtaining an ex- 
tract from the tissue. 






In one embodiment it is preferred that the sample comprises substantially only cells from 
said tissue, such as substantially only cells from mucosa of the bladder.

The methods according to the invention may be used for determining any biological condition,
wherein said condition leads to a change in the expression of at least one gene, and
preferably a change in a variety of genes. 



Thus, the biological condition may be any malignant or premalignant condition, in particular
in bladder, such as a tumor or an adenocarcinoma, a carcinoma, a teratoma, a sarcoma, 
and/or a lymphoma, and/or carcinoma-in-situ, and/or dysplasia-in-situ. 

The expression level may be determined by single-gene approaches, i.e. wherein the determination
of expression from one or two or a few genes is conducted. It is however preferred
that information is obtained from several genes, so that an expression pattern is obtained. 

In a preferred embodiment expression from at least one gene from a first group is determined,
said first gene group representing genes being expressed at a higher level in one
type of tissue, i.e. tissue in one stage or one risk group, in combination with determination of
expression of at least one gene from a second group, said second group representing genes 
being expressed at a higher level in tissue from another stage or from another risk group. 
Thereby the validity of the prediction increases, since expression levels from genes from 
more than one group are determined. 






However, determination of the expression of a single gene whether belonging to the first 
group or second group is also within the scope of the present invention. In this case it is pre- 
ferred that the single gene is selected among genes having a high change in expression 
level from normal cells to biological condition cells. 

Another approach is determination of an expression pattern from a variety of genes, wherein 
the determination of the biological condition in the tissue relies on information from the expression
of a variety of genes, i.e. on the combination of expressed genes rather than on the information
from single genes.



The data presented herein relate to bladder tumors, and therefore the description
has focused on the gene expression level as one way of identifying genes that lose or gain
function in cancer tissue. Genes showing a remarkable downregulation (or complete loss) or
upregulation (gene expression gained de novo) of the expression level, measured as the
mRNA transcript, during the malignant progression in bladder from normal mucosa through
Ta superficial tumors and Carcinoma in situ (CIS) to T1, slightly invasive tumors, to T2, T3
and T4, which have spread to muscle or even further into lymph nodes or other organs, are
within the scope of the invention, as well as genes gaining importance during the differentiation
from normal towards malignancy.






The present invention relates to a variety of genes identified either by an EST identification
number and/or by a gene identification number. Both types of identification number relate to
identification numbers of the UniGene database, NCBI, build 18.


The various genes have been identified using Affymetrix arrays of the following product 
numbers: 

HUGeneFL (sold in 2000-2002) 
EOS Hu03 (customized Affymetrix array)
U133A (product #900367 sold in 2003)

Stage of a bladder tumor indicates how deep the tumor has penetrated. Superficial tumors
are termed Ta and Carcinoma in situ (CIS), and T1, T2, T3 and T4 are used to describe
increasing degrees of penetration into the muscle. The grade of a bladder tumor is
expressed on a scale of I-IV (1-4) according to Bergkvist, A.; Ljungquist, A.; Moberger,
B. ("Classification of bladder tumours based on the cellular pattern. Preliminary report of a
clinical-pathological study of 300 cases with a minimum follow-up of eight years", Acta Chir
Scand., 1965, 130(4):371-8). The grade reflects the cytological appearance of the cells.
Grade I cells are almost normal. Grade II cells are slightly deviant. Grade III cells are clearly
abnormal. Grade IV cells are highly abnormal. A special form of bladder malignancy is
carcinoma-in-situ or dysplasia-in-situ, in which the altered cells are located in situ.

It is important to predict the prognosis of a cancer disease, as superficial tumors may require
a less intensive treatment than invasive tumors. According to the invention the expression
level of genes may be used to identify genes whose expression can be used to identify a
certain stage and/or the prognosis of the disease. These "Classifiers" are divided into those
which can be used to identify Ta, Carcinoma in situ (CIS), T1, and T2 stages as well as
those identifying risk of recurrence or progression. In one aspect of the invention measuring
the transcript level of one or more of these genes may lead to a classifier that can add supplementary
information to the information obtained from the pathological classification. For
example, gene expression levels that signify a T2 stage will be unfavourable to detect in a Ta
tumor, as they may signal that the Ta tumor has the potential to become a T2 tumor. The
opposite is probably also true, that an expression level that signifies Ta will be favorable to
have in a T2 tumor. In that way independent information may be obtained from the pathological
classification and from a classification based on gene expression levels.
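The way an expression-based stage can supplement the pathological stage may be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name, the returned labels and the stage ordering (which leaves out CIS) are hypothetical and are not taken from the description.

```python
# Ordering of stages by depth of penetration. CIS is left out of this
# illustrative ordering; that omission is an assumption of the sketch.
STAGE_ORDER = ["Ta", "T1", "T2", "T3", "T4"]


def supplementary_information(pathological_stage, expression_stage):
    """Compare the stage assigned by pathology with the stage that the
    gene expression levels signify, and label the combination."""
    path_rank = STAGE_ORDER.index(pathological_stage)
    expr_rank = STAGE_ORDER.index(expression_stage)
    if expr_rank > path_rank:
        # e.g. a Ta tumor whose expression levels signify a T2 stage
        return "unfavourable"
    if expr_rank < path_rank:
        # e.g. a T2 tumor whose expression levels signify Ta
        return "favourable"
    return "concordant"


print(supplementary_information("Ta", "T2"))
```

The sketch keeps the two classifications separate and only compares them, in line with the statement that the two sources of information are independent.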

In the present context a standard expression level is the level of expression of a gene in a
standard situation, such as a standard Ta tumor or a standard T2 tumor. For use in the present
invention standard expression levels are determined for each stage as well as for each
group of progression, recurrence, and other prognostic indices. It is then possible to compare
the result of a determination of the expression level from a gene of a given biological
condition with a standard for each stage, progression, recurrence and other indices to obtain
a classification of the biological condition.
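The comparison of a measured expression level with a standard for each stage can be sketched as a nearest-standard lookup. This is a minimal sketch only: the standard values are hypothetical and not measured data, and the use of a distance on a log2 scale is an assumption rather than a rule stated in the description.

```python
import math

# Hypothetical standard expression levels (arbitrary units) of one gene in
# each standard situation; the values are illustrative, not measured data.
STANDARDS = {"Ta": 120.0, "T1": 310.0, "T2": 900.0}


def classify_by_standard(level, standards):
    """Return the standard situation whose expression level is closest
    to the measured level on a log2 scale."""
    if level <= 0:
        raise ValueError("expression level must be positive")
    return min(
        standards,
        key=lambda stage: abs(math.log2(level) - math.log2(standards[stage])),
    )


print(classify_by_standard(150.0, STANDARDS))
```

The same lookup could be applied to standards for progression or recurrence groups by passing a different dictionary.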


Furthermore, in the present context a reference pattern refers to the pattern of expression
levels seen in standard situations as discussed above, and reference patterns may be used 
as discussed above for standard expression levels. 
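Matching a sample against reference patterns can be sketched with a correlation measure. The following is a sketch under assumptions: Pearson correlation is one possible similarity measure and is not prescribed by the description, and the reference values below are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    if den == 0:
        raise ValueError("a pattern without variation cannot be correlated")
    return num / den


# Hypothetical reference patterns: log-scale expression levels of the same
# four genes, in the same order, in two standard situations.
REFERENCE_PATTERNS = {
    "Ta": [8.1, 3.2, 5.0, 2.1],
    "T2": [3.0, 7.9, 5.2, 6.8],
}


def best_matching_pattern(sample, references):
    """Return the name of the reference pattern that correlates best
    with the sample pattern."""
    return max(references, key=lambda name: pearson(sample, references[name]))


print(best_matching_pattern([7.5, 3.0, 5.5, 2.5], REFERENCE_PATTERNS))
```

Because the whole pattern is compared, the result depends on the combination of genes rather than on any single gene.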

It is known from the histopathological classification of bladder tumors that some information
is obtained from merely classifying into stage and grade of tumor. Accordingly, in one aspect,
the invention relates to a method of predicting the prognosis of the biological condition
by determining the stage of the biological condition, by determining an expression level of at
least one gene, wherein said gene is selected from the group of genes consisting of gene No.
1 to gene No. 562. In this aspect information about the stage directly reveals information
about the prognosis as well. An example hereof is when a bladder tumor is classified as for
example stage T2, then the prognosis for the bladder tumor is obtained directly from the
prognosis related generally to stage T2 tumors. In a preferred embodiment the genes for
predicting the prognosis by establishing the stage of the tumor may be selected from the
group of genes consisting of gene No. 1 to gene No. 188. More preferably
the genes for predicting the prognosis by establishing the stage of the tumor may be selected
from the group of genes consisting of gene Nos. 18, 39, 40, 55,
58, 79, 86, 87, 88, 91, 93, 103, 105, 106, 121, 123, 125, 126, 136, 137, 140, 149, 156, 158,
161, 165, 166, 167, 175, 184, 187, 188.


It is preferred that the expression level of more than one gene is determined, such as the expression
level of at least two genes, such as the expression level of at least three genes, such as
the expression level of at least four genes, such as the expression level of at least five
genes, such as the expression level of at least six genes, such as the expression level of at
least seven genes, such as the expression level of at least eight genes, such as the expression
level of at least nine genes, such as the expression level of at least ten genes, such as
the expression level of at least 15 genes, such as the expression level of at least 20 genes,
such as the expression levels of at least 25 genes, such as the expression levels of at least
30 genes, such as the expression level of 32 genes.






As discussed above, in relation to bladder cancer the stages of a bladder tumor are selected
from bladder cancer stages Ta, Carcinoma in situ, T1, T2, T3 and T4. In an embodiment the
determination of a stage comprises
assaying at least the expression of a Ta stage gene from a Ta stage gene group, at least one
expression of a CIS gene, at least one expression of a T1 stage gene from a T1 stage gene
group, and at least the expression of a T2 stage gene from a T2 stage gene group, and more preferably
assaying at least the expression of a Ta stage gene from a Ta stage gene group, at
least one expression of a CIS gene, at least one expression of a T1 stage gene from a T1
stage gene group, at least the expression of a T2 stage gene from a T2 stage gene group, at
least the expression of a T3 stage gene from a T3 stage gene group, and at least the expression of
a T4 stage gene from a T4 stage gene group, wherein at least one gene from each gene group
is expressed in a significantly different amount in that stage than in one of the other stages.


Preferably, the genes selected may be a gene from each gene group being expressed in a
significantly higher amount in that stage than in one of the other stages as compared to normal
controls, see for example Table B below.

The genes selected may be a gene from each gene group being expressed in a significantly
lower amount in that stage than in one of the other stages.
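Whether a gene is expressed in a different amount in one stage than in one of the other stages can be illustrated with a simple fold-change check. This is a sketch only: the description speaks of a significant difference and does not prescribe a test, so the fold-change criterion, the threshold of 2.0 and the stage means below are all assumptions made for illustration.

```python
def is_stage_marker(stage_means, stage, min_fold=2.0):
    """Return True if the mean expression level of a gene in `stage`
    differs by at least `min_fold` (higher or lower) from its mean level
    in at least one other stage. All levels must be positive."""
    level = stage_means[stage]
    for other, other_level in stage_means.items():
        if other == stage:
            continue
        fold = max(level, other_level) / min(level, other_level)
        if fold >= min_fold:
            return True
    return False


# Hypothetical mean expression levels of one gene per stage.
example_means = {"Ta": 100.0, "T1": 110.0, "T2": 400.0}
print(is_stage_marker(example_means, "Ta"))
```

In practice a statistical test over replicate samples would take the place of the fixed threshold; the fixed fold change is used here only to keep the sketch self-contained.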

In another embodiment the present invention relates to a method of predicting the prognosis
of a biological condition by obtaining information in addition to the stage classification as
such. As described above, by determining gene expression levels that signify a T2 stage in a
tumor otherwise classified as a Ta tumor, the expression levels signal that the Ta tumor has
the potential to become a T2 tumor. The opposite is also true, that an expression level that
signifies Ta will be favorable to have in a T2 tumor. In the present invention the inventors have
shown that some genes are relevant for obtaining this additional information.


Also, in one embodiment the present invention relates to a further method of predicting the
prognosis of a biological condition by obtaining information in addition to the stage classification
as such. Determination of squamous metaplasia in a tumor, in particular in a T2 stage
tumor, is indicative of risk of progression. In particular the genes may be selected from the
group of genes consisting of gene No. 215 to gene No. 232, see also table
H.

It is preferred that the expression level of more than one gene is determined, such as the expression
level of at least two genes, such as the expression level of at least three genes, such as
the expression level of at least four genes, such as the expression level of at least five
genes, such as the expression level of at least six genes, such as the expression level of at
least seven genes, such as the expression level of at least eight genes, such as the expression
level of at least nine genes, such as the expression level of at least ten genes, such as
the expression level of at least 15 genes, such as the expression level of 18 genes.

In another embodiment the invention relates to genes bearing information of recurrence of
the biological condition as such. In particular the genes may be selected from the
group of genes consisting of gene No. 189 to gene No. 214. It is preferred to determine
a first expression level of at least one gene from a first gene group, wherein the gene
from the first gene group is selected from the group of genes wherein expression is increased
in case of recurrence, genes No. 189 to No. 199 (recurrence genes), and to determine
a second expression level of at least one gene from a second gene group, wherein
the second gene group is selected from the group of genes wherein expression is increased
in case of no recurrence, genes No. 200 to No. 214 (non-recurrence genes), and to correlate
the first expression level to a standard expression level for progressors, and/or the second
expression level to a standard expression level for non-progressors to predict the prognosis
of the biological condition in the animal tissue, see also table C.
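The combination of recurrence genes and non-recurrence genes can be sketched as a simple two-group score. This is only a sketch under assumptions: the score definition (difference of mean log2 levels) is not taken from the description, the gene labels are placeholders for genes of the two groups, and the expression values are hypothetical.

```python
import math

# Placeholder labels for a few genes of each group (hypothetical subset).
RECURRENCE_GENES = ["gene_189", "gene_190"]        # from genes No. 189 to 199
NON_RECURRENCE_GENES = ["gene_200", "gene_201"]    # from genes No. 200 to 214


def recurrence_score(expression):
    """Mean log2 level of the recurrence genes minus the mean log2 level
    of the non-recurrence genes. A higher score means the sample looks
    more like a recurring tumor under the assumptions of this sketch."""
    up = sum(math.log2(expression[g]) for g in RECURRENCE_GENES) / len(RECURRENCE_GENES)
    down = sum(math.log2(expression[g]) for g in NON_RECURRENCE_GENES) / len(NON_RECURRENCE_GENES)
    return up - down


# Hypothetical expression levels in arbitrary units.
sample = {"gene_189": 800.0, "gene_190": 400.0, "gene_200": 100.0, "gene_201": 50.0}
print(recurrence_score(sample))
```

The score of a sample would then be compared with the scores obtained for the standard situations.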

It is preferred that the expression level of more than one gene is determined, such as the expression
level of at least two genes, such as the expression level of at least three genes, such as
the expression level of at least four genes, such as the expression level of at least five
genes, such as the expression level of at least six genes, such as the expression level of at
least seven genes, such as the expression level of at least eight genes, such as the expression
level of at least nine genes, such as the expression level of at least ten genes, such as
the expression level of at least 15 genes, such as the expression level of at least 20 genes,
such as the expression level of at least 25 genes, such as the expression level of 26 genes.

Furthermore, in another embodiment the invention relates to genes bearing information of
progression as such. In particular the genes may be selected from the group of genes of
table D, more preferably selected from the group of genes consisting of gene No. 233 to
gene No. 446. More preferably the genes may be selected from the group of genes Nos.
255, 273, 279, 280, 281, 282, 287, 295, 300, 311, 317, 320, 333, 346, 347, 349, 352, 364,
365, 373, 383, 386, 390, 394, 401, 407, 414, 417, 426, 427, 428, 433, 434, 435, 436, 437,
438, 439, 440, 441, 442, 443, 444, 445, 446, see table E.

It is preferred that the expression level of more than one gene is determined, such as the expression
level of at least two genes, such as the expression level of at least three genes, such as
the expression level of at least four genes, such as the expression level of at least five
genes, such as the expression level of at least six genes, such as the expression level of at
least seven genes, such as the expression level of at least eight genes, such as the expression
level of at least nine genes, such as the expression level of at least ten genes, such as
the expression level of at least 15 genes, such as the expression level of at least 20 genes,
such as the expression levels of at least 25 genes, such as the expression levels of at least
30 genes, such as the expression level of at least 35 genes, such as the expression level of
at least 40 genes, such as the expression level of 45 genes.

Furthermore, it is within the scope of the invention to predict the prognosis of a biological 
5 condition in animal tissue by determining the expression level of at least two genes, by 



determining a first expression level of at least one gene from a first gene group, wherein
the gene from the first gene group is selected from the group of gene Nos. 237, 238,
239, 240, 241, 242, 243, 245, 246, 247, 248, 250, 253, 254, 257, 258, 260, 263, 264,
265, 267, 270, 271, 272, 278, 283, 284, 287, 288, 290, 291, 292, 294, 297, 298, 300,
302, 303, 305, 309, 310, 315, 316, 317, 318, 319, 321, 324, 329, 335, 336, 337, 339,
340, 344, 346, 347, 354, 356, 358, 359, 362, 364, 365, 368, 369, 371, 372, 377, 378,
379, 380, 381, 382, 383, 384, 388, 391, 393, 395, 396, 397, 399, 402, 403, 404, 409,
413, 417, 419, 420, 421, 422, 423, 425, 427, 429, 430, 431, 432, 437, 444 (progressor
genes), and



determining a second expression level of at least one gene from a second gene group,
wherein the gene from the second gene group is selected from the group of gene Nos. 233,
234, 235, 236, 244, 249, 251, 252, 255, 256, 259, 261, 262, 266, 268, 269, 273, 274, 275,
276, 277, 279, 280, 281, 282, 285, 286, 289, 293, 295, 296, 299, 301, 304, 306, 307, 308,
311, 312, 313, 314, 320, 322, 323, 325, 326, 327, 328, 330, 331, 332, 333, 334, 338,
341, 342, 343, 345, 348, 349, 350, 351, 352, 353, 355, 357, 360, 361, 363, 366, 367,
370, 373, 374, 375, 376, 385, 386, 387, 389, 390, 392, 394, 398, 400, 401, 405, 406,
407, 408, 410, 411, 412, 414, 415, 416, 418, 424, 426, 428, 433, 434, 435, 436, 438,
439, 440, 441, 442, 443, 445, 446 (non-progressor genes), and

correlating the first expression level to a standard expression level for progressors, 
and/or the second expression level to a standard expression level for non-progressors to 
predict the prognosis of the biological condition in the animal tissue. 
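
The two-group correlation described above can be sketched as follows. This is a minimal illustration only, not the classifier used in the examples of the invention: the shortened gene groups, the log-scale expression values, the mean absolute deviation used as the measure of agreement, and the function name predict_progression are all assumptions made for the sketch.

```python
from statistics import mean

# Hypothetical, shortened gene groups; the full groups are the gene Nos. listed above.
PROGRESSOR_GENES = [237, 238, 239]
NON_PROGRESSOR_GENES = [233, 234, 235]


def predict_progression(sample, progressor_standard, non_progressor_standard):
    """Classify one sample as 'progressor' or 'non-progressor'.

    sample maps gene No. -> measured expression level (log scale assumed).
    progressor_standard maps each progressor gene No. to its standard level
    in progressors; non_progressor_standard maps each non-progressor gene No.
    to its standard level in non-progressors.
    """
    def deviation(genes, standard):
        # Mean absolute deviation of the sample from a standard over one gene group.
        return mean(abs(sample[g] - standard[g]) for g in genes)

    # First expression level correlated to the progressor standard,
    # second expression level correlated to the non-progressor standard.
    d_progressor = deviation(PROGRESSOR_GENES, progressor_standard)
    d_non_progressor = deviation(NON_PROGRESSOR_GENES, non_progressor_standard)
    return "progressor" if d_progressor < d_non_progressor else "non-progressor"
```

A sample whose progressor genes lie close to the progressor standard while its non-progressor genes lie far from the non-progressor standard is assigned to the progressor group, and vice versa. Any other measure of agreement with the standards could be substituted for the deviation used here.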


In particular the genes of the first group and the second group for predicting the prognosis of
a Ta stage tumor may be selected from the group of progression/non-progression genes
described above.

In yet another embodiment the present invention offers the possibility to predict the presence
or absence of carcinoma in situ (CIS) in the same organ as the primary biological condition. An
example hereof is, for a Ta bladder tumor, to predict whether the bladder in addition to the Ta
tumor comprises carcinoma in situ. The presence of carcinoma in situ in a bladder
containing a superficial Ta tumor is a signal that the Ta tumor has the potential of recurrence
and invasiveness. Accordingly, by predicting the presence of carcinoma in situ important
information about the prognosis is obtained. In the present context, genes for predicting the
presence of carcinoma in situ for a Ta stage tumor may be selected from
the group of genes consisting of gene No. 447 to gene No. 562. More preferably the genes
are selected from the group of genes consisting of gene Nos. 447, 448, 449, 450, 451, 452,
453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470,
471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488,
489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506,
507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524,
525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542,
543, 544, 545, 546, see table F, or from the group of genes consisting of gene Nos. 547,
548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, see table G.



It is preferred that the expression level of more than one gene is determined, such as the
expression level of at least two genes, such as the expression level of at least three genes,
such as the expression level of at least four genes, such as the expression level of at least
five genes, such as the expression level of at least six genes, such as the expression level
of at least seven genes, such as the expression level of at least eight genes, such as the
expression level of at least nine genes, such as the expression level of at least ten genes,
such as the expression level of at least 15 genes, such as the expression level of at least
20 genes, such as the expression level of at least 25 genes, such as the expression level of
at least 30 genes, such as the expression level of at least 35 genes, such as the expression
level of at least 40 genes, such as the expression level of at least 45 genes, such as the
expression level of at least 50 genes, such as 100 genes. In another embodiment the
expression level of 16 genes is determined.

It is also preferred to determine a first expression level of at least one gene from a first gene
group, wherein the gene from the first gene group is selected from the group of genes
wherein expression is increased in case of CIS, gene Nos. 447, 448, 449, 450, 451, 452,
454, 455, 456, 457, 458, 459, 462, 468, 474, 478, 484, 489, 491, 493, 495, 500, 501, 502,
504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 518, 519, 520, 522, 523, 524, 525,
529, 531, 534, 535, 536, 538, 544, 546, 547, 548, 549, 550, 551, 552, 553, 555, 556, 558,
559, 561, 562 (CIS genes), and to determine a second expression level of at least one gene
from a second gene group, wherein the gene from the second gene group is selected from
the group of genes wherein expression is increased in case of no CIS, gene Nos. 453, 460,
461, 463, 464, 465, 466, 467, 469, 470, 471, 472, 473, 475, 476, 477, 479, 480, 481, 482,
483, 485, 486, 487, 488, 490, 492, 494, 496, 497, 498, 499, 503, 515, 516, 517, 521, 526,
527, 528, 530, 532, 533, 537, 539, 540, 541, 542, 543, 545, 554, 557, 560 (non-CIS genes),
and to correlate the first expression level to a standard expression level for CIS, and/or the
second expression level to a standard expression level for non-CIS, to predict the prognosis
of the biological condition in the animal tissue.

It is preferred, when determining the expression level of at least one gene from a first group
and at least one gene from a second group, that the expression level of more than one gene
from each group is determined. Thus, it is preferred that the expression level of more than
one gene is determined, such as the expression level of at least two genes, such as the
expression level of at least three genes, such as the expression level of at least four genes,
such as the expression level of at least five genes, such as the expression level of at least
six genes, such as the expression level of at least seven genes, such as the expression level
of at least eight genes, such as the expression level of at least nine genes, such as the
expression level of at least ten genes in each group.

In one embodiment of the invention the stage of the biological condition has been
determined before the prediction of prognosis. The stage may be determined by any suitable
means, such as by histological examination of the tissue or by genotyping of the
tissue, preferably by genotyping of the tissue such as described herein or as described in
WO 02/02804, incorporated herein by reference.

In another aspect the invention relates to a method of determining the stage of a biological
condition in animal tissue,

comprising collecting a sample comprising cells from the tissue,

determining an expression level of at least one gene selected from the group of genes
consisting of gene No. 1 to gene No. 562,

correlating the expression level of the assessed genes to at least one standard level of
expression determining the stage of the condition.


In particular the expression level of at least one gene selected from the group of genes
consisting of gene Nos. 1-457, gene Nos. 459-535 and gene Nos. 537-562 is determined.

Specific embodiments of determining the stage are as described above for predicting
prognosis by determination of stage.

In a preferred embodiment the expression level of at least two genes is determined by 




determining the expression of at least a first stage gene from a first stage gene group
and at least a second stage gene from a second stage gene group, wherein at least one
of said genes is expressed in said first stage of the condition in a higher amount than in
said second stage, and the other gene is expressed in said first stage of the condition
in a lower amount than in said second stage of the condition, and



correlating the expression level of the assessed genes to a standard level of expression 
determining the stage of the condition. 



In general, genes that are downregulated in higher stage tumors as well as in progression
and recurrence may be of importance as predictive markers for the disease, as loss of one or
more of these may signal a poor outcome or an aggressive disease course. Furthermore,
they may be important targets for therapy, as restoring their expression level, e.g. by gene
therapy, or substitution with their peptide products or with small molecules having a similar
biological effect may suppress the malignant growth.



Genes that are up-regulated (or gained de novo) during the malignant progression of bladder
cancer from normal tissue through Ta, T1, T2, T3 and T4 are also within the scope of the
invention. These genes are potential oncogenes and may be those genes that create or
enhance the malignant growth of the cells. The expression level of these genes may serve as
predictive markers for the disease course and treatment response, as a high level may
signal an aggressive disease course, and they may serve as targets for therapy, as blocking
these genes, e.g. by anti-sense therapy or by biochemical means, could inhibit or slow the
tumor growth.


The genes used according to the invention show a sufficient difference in expression from
one group to another and/or from one stage to another to use the gene as a classifier for the
group and/or stage. Thus, comparison of one expression pattern to another may score a
change from expressed to non-expressed, or the reverse. Alternatively, changes in intensity
of expression may be scored, either increases or decreases. Any significant change can be
used. Typically, changes of more than 2-fold are suitable. Changes which are greater
than 5-fold are highly suitable.
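
As a minimal sketch of how such a fold-change criterion might be scored, the following compares the mean expression level of each gene between two groups and retains the genes reaching a chosen threshold. The function names, the use of group means, and the linear (non-logarithmic) expression values are assumptions made for illustration only.

```python
from statistics import mean


def fold_change(level_a, level_b):
    """Fold change between two positive expression levels, reported as >= 1
    regardless of the direction (increase or decrease) of the change."""
    if level_a <= 0 or level_b <= 0:
        raise ValueError("expression levels must be positive")
    return max(level_a, level_b) / min(level_a, level_b)


def classifier_genes(group_a, group_b, min_fold=2.0):
    """Return the gene Nos. whose mean expression differs by at least min_fold.

    group_a and group_b map gene No. -> list of expression levels
    (linear scale assumed), one value per sample in the group.
    """
    selected = []
    for gene in sorted(group_a):
        if gene not in group_b:
            continue  # gene not measured in both groups
        if fold_change(mean(group_a[gene]), mean(group_b[gene])) >= min_fold:
            selected.append(gene)
    return selected
```

Raising min_fold from 2 to 5 restricts the selection to the genes described above as highly suitable. A change from expressed to non-expressed would need separate handling, since a level of zero has no defined fold change.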



The present invention in particular relates to methods using genes wherein at least a
two-fold change in expression, such as at least a three-fold change, for example at least a
four-fold change, such as at least a five-fold change, for example at least a six-fold change,
such as at least a ten-fold change, for example at least a fifteen-fold change, such as at
least a twenty-fold change is seen between two groups.

As described above the invention relates to the use of information of expression levels. In
one embodiment an expression pattern is obtained; thus, the invention relates to a method
of determining an expression pattern of a bladder cell sample, comprising:

collecting a sample comprising bladder cells and/or expression products from bladder
cells,

determining the expression level of at least one gene in the sample, said gene being
selected from the group of genes consisting of gene No. 1 to gene No. 562, and obtaining
an expression pattern of the bladder cell sample.

The invention preferably includes more than one gene in the pattern; accordingly it is preferred
to include the expression level of at least two genes, such as the expression level of at least
three genes, such as the expression level of at least four genes, such as the expression
level of at least five genes, such as the expression level of more than six genes.

The expression pattern preferably relates to one or more of the group of genes discussed
above with respect to prognosis relating to stage, SSC, progression, recurrence and/or CIS.

In order to predict prognosis and/or stages it is preferred to determine an expression pattern
of a cell sample preferably independent of the proportion of submucosal, muscle and
connective tissue cells present. Expression is determined of one or more genes in a sample
comprising cells, said genes being selected from the same genes as discussed above and
shown in the tables.


It is an object of the present invention that characteristic patterns of expression of genes can
be used to characterize different types of tissue. Thus, for example gene expression patterns
can be used to characterize stages and grades of bladder tumors. Similarly, gene expression
patterns can be used to distinguish cells having a bladder origin from other cells. Moreover,
gene expression of cells which routinely contaminate bladder tumor biopsies has been
identified, and such gene expression can be removed or subtracted from patterns obtained
from bladder biopsies. Further, the gene expression patterns of single-cell solutions of
bladder tumor cells have been found to contain substantially less interfering expression from
contaminating muscle, submucosal, and connective tissue cells than biopsy samples.






The one or more genes exclude genes which are expressed in the submucosal, muscle, and 
connective tissue. A pattern of expression is formed for the sample which is independent of 
the proportion of submucosal, muscle, and connective tissue cells in the sample. 




In another aspect of the invention a method of determining an expression pattern of a cell
sample is provided. Expression is determined of one or more genes in a sample comprising
cells. A first pattern of expression is thereby formed for the sample. Genes which are
expressed in submucosal, muscle, and connective tissue cells are removed from the first
pattern of expression, forming a second pattern of expression which is independent of the
proportion of submucosal, muscle, and connective tissue cells in the sample.
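
The removal step can be sketched as a simple filter over the first pattern. The representation of a pattern as a mapping from gene No. to expression level, and the function name, are assumptions for illustration; the set of contaminant genes would have to be established separately, e.g. from bladder wall samples.

```python
def remove_contaminant_genes(first_pattern, contaminant_genes):
    """Form a second pattern by dropping genes expressed in submucosal,
    muscle, and connective tissue cells from the first pattern.

    first_pattern maps gene No. -> expression level.
    contaminant_genes is a collection of gene Nos. to exclude (hypothetical input).
    """
    excluded = set(contaminant_genes)
    return {gene: level for gene, level in first_pattern.items()
            if gene not in excluded}
```

Because the remaining genes are, by construction, not expressed in the contaminating cell types, their levels do not depend on how much bladder wall tissue a given biopsy happens to contain.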



Another embodiment of the invention provides a method for determining an expression
pattern of a bladder mucosa or bladder cancer cell. Expression is determined of one or more
genes in a sample comprising bladder mucosa or bladder cancer cells; the expression
determined forms a first pattern of expression. A second pattern of expression, which was
formed using the one or more genes and a sample comprising predominantly submucosal,
muscle, and connective tissue cells, is subtracted from the first pattern of expression,
forming a third pattern of expression. The third pattern of expression reflects expression of
the bladder mucosa or bladder cancer cells independent of the proportion of submucosal,
muscle, and connective tissue cells present in the sample.
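
One possible reading of the subtraction is a gene-by-gene difference between the two patterns, as sketched below. The scaling parameter wall_fraction (an estimate of the share of bladder wall material in the biopsy), the clipping of negative values at zero, and the function name are assumptions introduced for the sketch and are not specified in the description.

```python
def subtract_pattern(first_pattern, wall_pattern, wall_fraction=1.0):
    """Subtract a bladder wall pattern from a biopsy pattern, gene by gene.

    first_pattern maps gene No. -> expression level measured in the biopsy.
    wall_pattern maps gene No. -> expression level measured in a sample of
    predominantly submucosal, muscle, and connective tissue cells.
    wall_fraction is a hypothetical scaling factor for the wall contribution.
    """
    third_pattern = {}
    for gene, level in first_pattern.items():
        corrected = level - wall_fraction * wall_pattern.get(gene, 0.0)
        # Expression levels cannot be negative, so clip at zero.
        third_pattern[gene] = max(corrected, 0.0)
    return third_pattern
```

Genes absent from the wall pattern are passed through unchanged, which corresponds to treating them as not expressed in the bladder wall.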

In one embodiment the invention provides a method to predict the prognosis of a bladder
tumor as described above. A first pattern of expression is determined of one or more genes
in a bladder tumor sample. The first pattern is compared to one or more reference patterns
of expression determined for bladder tumors at different stages and/or in different groups.
The reference pattern which shares maximum similarity with the first pattern is identified. The
stage of the reference pattern with the maximum similarity is assigned to the bladder tumor
sample.


In yet another embodiment the invention provides a method to determine the stage of a
bladder tumor as described above. A first pattern of expression is determined of one or more
genes in a bladder tumor sample. The first pattern is compared to one or more reference
patterns of expression determined for bladder tumors at different stages. The reference
pattern which shares maximum similarity with the first pattern is identified. The stage of the
reference pattern with the maximum similarity is assigned to the bladder tumor sample.
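
The maximum-similarity assignment can be sketched as follows. The description does not fix a similarity measure; Pearson correlation is used here purely as an assumption, and the function names and stage labels in the usage note are illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant pattern")
    return cov / (sd_x * sd_y)


def assign_stage(first_pattern, reference_patterns):
    """Return the stage whose reference pattern is most similar to the sample.

    first_pattern maps gene No. -> expression level of the tumor sample.
    reference_patterns maps stage label -> pattern over the same gene Nos.
    """
    genes = sorted(first_pattern)
    sample = [first_pattern[g] for g in genes]

    def similarity(stage):
        reference = reference_patterns[stage]
        return pearson(sample, [reference[g] for g in genes])

    return max(reference_patterns, key=similarity)
```

For example, with reference patterns labelled "Ta" and "T2", a sample whose levels rise and fall across the genes in the same way as the "Ta" reference is assigned stage "Ta". The same function serves for prognosis if the references are labelled by group rather than by stage.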

Since a biopsy of the tissue often contains more tissue material such as connective tissue
than the tissue to be examined, when the tissue to be examined is epithelial or mucosa, the
invention also relates to methods wherein the expression pattern of the tissue is
independent of the amount of connective tissue in the sample.

Biopsies contain epithelial cells that most often are the targets for the studies, and in addition
many other cells that contaminate the epithelial cell fraction to a varying extent. The
contaminants include histiocytes, endothelial cells, leukocytes, nerve cells, muscle cells etc.
Microdissection is the method of choice for DNA examination, but in the case of expression
studies this procedure is difficult due to RNA degradation during the procedure. The epithelium
may be gently removed and the expression in the remaining submucosa and underlying
connective tissue (the bladder wall) monitored. Genes expressed at high or low levels in the
bladder wall should be interrogated when performing expression monitoring of the mucosa and
tumors. A similar approach could be used for studies of epithelia in other organs.

In one embodiment of the invention normal mucosa lining the bladder lumen from bladders for
cancer is scraped off. Then biopsies are taken from the denuded submucosa and connective
tissue, reaching approximately 5 mm into the bladder wall, and immediately disintegrated in
guanidinium isothiocyanate. Total RNA may be extracted and pooled, and poly(A)+ mRNA may be
prepared from the pool, followed by conversion to double-stranded cDNA and in vitro
transcription into cRNA containing biotin-labeled CTP and UTP.


Genes that are expressed and genes that are not expressed in bladder wall can both interfere 
with the interpretation of the expression in a biopsy, and should be considered when 
interpreting expression intensities in tumor biopsies, as the bladder wall component of a biopsy 
varies in amount from biopsy to biopsy. 


Once the pattern of genes expressed in bladder wall components has been determined, said
pattern may be subtracted from a pattern obtained from the sample, resulting in a third pattern
related to the mucosa (epithelial) cells.

In another embodiment of the invention a method is provided for determining an expression
pattern of a bladder tissue sample independent of the proportion of submucosal, muscle and
connective tissue cells present. A single-cell suspension of disaggregated bladder tumor
cells is isolated from a bladder tissue sample comprising bladder cells, submucosal cells,
muscle cells, and connective tissue cells. A pattern of expression is thus formed for the
sample which is independent of the proportion of submucosal, muscle, and connective tissue
cells in the bladder tissue sample.

Yet another method relates to the elimination of mRNA from bladder wall components before
determining the pattern, e.g. by filtration and/or affinity chromatography to remove mRNA
related to the bladder wall.

Working with tumor material requires biopsies or body fluids suspected to comprise relevant
cells. Working with RNA requires freshly frozen or immediately processed biopsies, or
chemical pretreatment of the biopsy. Apart from the cancer tissue, biopsies inevitably
contain many different cell types, such as cells present in the blood, connective and muscle
tissue, endothelium etc. In the case of DNA studies, microdissection or laser capture are
methods of choice; however, the time-dependent degradation of RNA makes it difficult to
perform manipulation of the tissue for more than a few minutes. Furthermore, studies of
expressed sequences may be difficult on the few cells obtained via microdissection or laser
capture, as these cells may have an expression pattern that deviates from the predominant
pattern in a tumor due to large intratumoral heterogeneity.



In the present context high density expression arrays may be used to evaluate the impact of
bladder wall components in bladder tumor biopsies, and to test the preparation of single cell
solutions as a means of eliminating the contaminants. The results of these evaluations
permit the design of methods of evaluating bladder samples without the interfering
background noise caused by ubiquitous contaminating submucosal, muscle, and connective
tissue cells. The evaluating assays of the invention may be of any type.

While high density expression arrays can be used, other techniques are also contemplated.
These include other techniques for assaying for specific mRNA species, including RT-PCR
and Northern blotting, as well as techniques for assaying for particular protein products,
such as ELISA, Western blotting, and enzyme assays. Gene expression patterns according
to the present invention are determined by measuring any gene product of a particular gene,
including mRNA and protein. A pattern may be for one or more genes.

RNA or protein can be isolated and assayed from a test sample using any techniques known
in the art. They can for example be isolated from a fresh or frozen biopsy, from formalin-fixed
tissue, or from body fluids, such as blood, plasma, serum, urine, or sputum.

Expression of genes may in general be detected by either detecting mRNA from the cells 
and/or detecting expression products, such as peptides and proteins. 


The detection of mRNA of the invention may be a tool for determining the developmental
stage of a cell type which may be definable by its pattern of expression of messenger RNA.
For example, in particular stages of cells, high levels of ribosomal RNA are found whereas
relatively low levels of other types of messenger RNAs may be found. Where a pattern is
shown to be characteristic of a stage, said stage may be defined by that particular pattern of
messenger RNA expression. The mRNA population is a good determinant of a
developmental stage, and may be correlated with other structural features of the cell. In this
manner, cells at specific developmental stages will be characterized by the intracellular
environment, as well as the extracellular environment. The present invention also allows the
combination of definitions based in part upon antigens and in part upon mRNA expression.
In one embodiment, the two may be combined in a single incubation step. A particular
incubation condition may be found which is compatible with both hybridization recognition
and non-hybridization recognition molecules. Thus, e.g. an incubation condition may be
selected which allows both specificity of antibody binding and specificity of nucleic acid
hybridization. This allows simultaneous performance of both types of interactions on a single
matrix. Again, where developmental mRNA patterns are correlated with structural features,
or with probes which are able to hybridize to intracellular mRNA populations, a cell sorter
may be used to sort specifically those cells having desired mRNA population patterns.


It is within the general scope of the present invention to provide methods for the detection of
mRNA. Such methods often involve sample extraction, PCR amplification, nucleic acid
fragmentation and labeling, extension reactions, and transcription reactions.

The nucleic acid (either genomic DNA or mRNA) may be isolated from the sample according
to any of a number of methods well known to those of skill in the art. One of skill will
appreciate that where alterations in the copy number of a gene are to be detected, genomic
DNA is preferably isolated. Conversely, where expression levels of a gene or genes are to
be detected, preferably RNA (mRNA) is isolated.


Methods of isolating total mRNA are well known to those of skill in the art. In one
embodiment, the total nucleic acid is isolated from a given sample using, for example, an
acid guanidinium-phenol-chloroform extraction method, and polyA+ mRNA is isolated
by oligo dT column chromatography or by using (dT)n magnetic beads (see, e.g., Sambrook
et al., Molecular Cloning: A Laboratory Manual (2nd ed.), Vols. 1-3, Cold Spring Harbor
Laboratory, (1989), or Current Protocols in Molecular Biology, F. Ausubel et al., ed. Greene
Publishing and Wiley-Interscience, New York (1987)).

The sample may be from tissue and/or body fluids, as defined elsewhere herein. Before
analyzing the sample, e.g., on an oligonucleotide array, it will often be desirable to perform
one or more sample preparation operations upon the sample. Typically, these sample
preparation operations will include such manipulations as extraction of intracellular material,
e.g., nucleic acids from whole cell samples, viruses, amplification of nucleic acids,
fragmentation, transcription, labeling and/or extension reactions. One or more of these
various operations may be readily incorporated into the device of the present invention.

DNA extraction may be relevant under circumstances where possible mutations in the genes
are to be determined in addition to the determination of expression of the genes.

For those embodiments where whole cells, or other tissue samples are being analyzed, it will
typically be necessary to extract the nucleic acids from the cells or viruses, prior to
continuing with the various sample preparation operations. Accordingly, following sample
collection, nucleic acids may be liberated from the collected cells, viral coat etc. into a crude
extract followed by additional treatments to prepare the sample for subsequent operations,
such as denaturation of contaminating (DNA binding) proteins, purification, filtration and
desalting.



Liberation of nucleic acids from the sample cells, and denaturation of DNA binding proteins
may generally be performed by physical or chemical methods. For example, chemical
methods generally employ lysing agents to disrupt the cells and extract the nucleic acids
from the cells, followed by treatment of the extract with chaotropic salts such as guanidinium
isothiocyanate or urea to denature any contaminating and potentially interfering proteins.






Alternatively, physical methods may be used to extract the nucleic acids and denature DNA
binding proteins, such as physical protrusions within microchannels or sharp edged particles
that pierce cell membranes and extract their contents. Combinations of such structures with
piezoelectric elements for agitation can provide suitable shear forces for lysis.



More traditional methods of cell extraction may also be used, e.g., employing a channel with
restricted cross-sectional dimension which causes cell lysis when the sample is passed
through the channel with sufficient flow pressure. Alternatively, cell extraction and denaturing
of contaminating proteins may be carried out by applying an alternating electrical current to
the sample. More specifically, the sample of cells is flowed through a microtubular array
while an alternating electric current is applied across the fluid flow. Subjecting cells to
ultrasonic agitation, or forcing cells through microgeometry apertures, thereby subjecting the
cells to high shear stress resulting in rupture, are also possible extraction methods.

Following extraction, it will often be desirable to separate the nucleic acids from other
elements of the crude extract, e.g. denatured proteins, cell membrane particles and salts.
Removal of particulate matter is generally accomplished by filtration or flocculation. Further,
where chemical denaturing methods are used, it may be desirable to desalt the sample prior
to proceeding to the next step. Desalting of the sample and isolation of the nucleic acid may
generally be carried out in a single step, e.g. by binding the nucleic acids to a solid phase
and washing away the contaminating salts, or by performing gel filtration chromatography on
the sample, or by passing salts through dialysis membranes. Suitable solid supports for
nucleic acid binding include e.g. diatomaceous earth or silica (i.e., glass wool). Suitable gel
exclusion media are also well known in the art, may be readily incorporated into the devices
of the present invention, and are commercially available from, e.g., Pharmacia and Sigma
Chemical.



Alternatively, desalting methods may generally take advantage of the high electrophoretic
mobility and negativity of DNA compared to other elements. Electrophoretic methods may
also be utilized in the purification of nucleic acids from other cell contaminants and debris.
Upon application of an appropriate electric field, the nucleic acids present in the sample will
migrate toward the positive electrode and become trapped on the capture membrane.
Sample impurities remaining free of the membrane are then washed away by applying an
appropriate fluid flow. Upon reversal of the voltage, the nucleic acids are released from the
membrane in a substantially purer form. Further, coarse filters may also be overlaid on the
barriers to avoid any fouling of the barriers by particulate matter, proteins or nucleic acids,
thereby permitting repeated use.

In a similar aspect, the high electrophoretic mobility of nucleic acids with their negative
charges may be utilized to separate nucleic acids from contaminants by utilizing a short
column of a gel or other appropriate matrix which will slow or retard the flow of
other contaminants while allowing the faster nucleic acids to pass.

This invention provides nucleic acid affinity matrices that bear a large number of different
nucleic acid affinity ligands allowing the simultaneous selection and removal of a large
number of preselected nucleic acids from the sample. Methods of producing such affinity
matrices are also provided. In general the methods involve the steps of a) providing a nucleic
acid amplification template array comprising a surface to which are attached at least 50
oligonucleotides having different nucleic acid sequences, and wherein each different
oligonucleotide is localized in a predetermined region of said surface, the density of said
oligonucleotides is greater than about 60 different oligonucleotides per 1 cm², and all of
said different oligonucleotides have an identical terminal 3' nucleic acid sequence and an
identical terminal 5' nucleic acid sequence, b) amplifying said multiplicity of oligonucleotides
to provide a pool of amplified nucleic acids; and c) attaching the pool of nucleic acids to a
solid support.

For example, nucleic acid affinity chromatography is based on the tendency of
complementary, single-stranded nucleic acids to form a double-stranded or duplex structure
through complementary base pairing. A nucleic acid (either DNA or RNA) can easily be
attached to a solid substrate (matrix) where it acts as an immobilized ligand that interacts
with and forms duplexes with complementary nucleic acids present in a solution contacted to
the immobilized ligand. Unbound components can be washed away from the bound complex
to either provide a solution lacking the target molecules bound to the affinity column, or to
provide the isolated target molecules themselves. The nucleic acids captured in a hybrid
duplex can be separated and released from the affinity matrix by denaturation either through
heat, adjustment of salt concentration, or the use of a destabilizing agent such as
formamide, TWEEN™-20 denaturing agent, or sodium dodecyl sulfate (SDS).

Affinity columns (matrices) are typically used to isolate a single nucleic acid, typically
by providing a single species of affinity ligand. Alternatively, affinity columns bearing a single
affinity ligand (e.g. oligo-dT columns) have been used to isolate a multiplicity of nucleic acids
where the nucleic acids all share a common sequence (e.g. a polyA).

The type of affinity matrix used depends on the purpose of the analysis. For example, where
it is desired to analyze mRNA expression levels of particular genes in a complex nucleic acid
sample (e.g., total mRNA) it is often desirable to eliminate nucleic acids produced by genes
that are constitutively overexpressed and thereby tend to mask gene products expressed at
characteristically lower levels. Thus, in one embodiment, the affinity matrix can be used to
remove a number of preselected gene products (e.g., actin, GAPDH, etc.). This is
accomplished by providing an affinity matrix bearing nucleic acid affinity ligands
complementary to the gene products (e.g., mRNAs or nucleic acids derived therefrom) or to
subsequences thereof. Hybridization of the nucleic acid sample to the affinity matrix will
result in duplex formation between the affinity ligands and their target nucleic acids. Upon
elution of the sample from the affinity matrix, the matrix will retain the duplexed nucleic acids,
leaving a sample depleted of the overexpressed target nucleic acids.



The affinity matrix can also be used to identify unknown mRNAs or cDNAs in a sample. 
Where the affinity matrix contains nucleic acids complementary to every known gene (e.g., in 
a cDNA library, DNA reverse transcribed from an mRNA, mRNA used directly or amplified, 
or polymerized from a DNA template) in a sample, capture of the known nucleic acids by the 
affinity matrix leaves a sample enriched for those nucleic acid sequences that are unknown. 
In effect, the affinity matrix is used to perform a subtractive hybridization to isolate unknown 
nucleic acid sequences. The remaining "unknown" sequences can then be purified and 
sequenced according to standard methods. 
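
The subtractive use of the affinity matrix described above can be illustrated with a small sketch. It is a deliberate simplification, not the method of the invention: hybridization is modeled as an exact, full-length match between a sample sequence and the reverse complement of an affinity ligand, whereas real capture depends on salt, temperature, and partial complementarity. The function names and example sequences are hypothetical.

```python
# Translation table for DNA base complements (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def subtract_known(sample, ligands):
    """Split sample sequences into (captured, flow_through).

    Assumption: a sample sequence is captured only if it is exactly the
    reverse complement of one of the immobilized ligands.
    """
    targets = {reverse_complement(ligand) for ligand in ligands}
    captured = [s for s in sample if s in targets]
    flow_through = [s for s in sample if s not in targets]
    return captured, flow_through


# Hypothetical example: one ligand against a "known" sequence.
ligands = ["TTGCA"]
sample = ["TGCAA", "GGGCC"]
captured, flow_through = subtract_known(sample, ligands)
```

In this toy run the flow-through corresponds to the sample enriched for "unknown" sequences, while the captured list corresponds to the known sequences retained on the matrix.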



The affinity matrix can also be used to capture (isolate) and thereby purify unknown nucleic
acid sequences. For example, an affinity matrix can be prepared that contains nucleic acids
(affinity ligands) that are complementary to sequences not previously identified, or not
previously known to be expressed in a particular nucleic acid sample. The sample is then
hybridized to the affinity matrix and those sequences that are retained on the affinity matrix
are "unknown" nucleic acids. The retained nucleic acids can be eluted from the matrix (e.g.
at increased temperature, increased destabilizing agent concentration, or decreased salt)
and the nucleic acids can then be sequenced according to standard methods.



Similarly, the affinity matrix can be used to efficiently capture (isolate) a number of known
nucleic acid sequences. Again, the matrix is prepared bearing nucleic acids complementary
to those nucleic acids it is desired to isolate. The sample is contacted to the matrix under
conditions where the complementary nucleic acid sequences hybridize to the affinity ligands
in the matrix. The non-hybridized material is washed off the matrix leaving the desired
sequences bound. The hybrid duplexes are then denatured providing a pool of the isolated
nucleic acids. The different nucleic acids in the pool can be subsequently separated
according to standard methods (e.g. gel electrophoresis).

As indicated above, the affinity matrices can be used to selectively remove nucleic acids from
virtually any sample containing nucleic acids (e.g. in a cDNA library, DNA reverse
transcribed from an mRNA, mRNA used directly or amplified, or polymerized from a DNA
template, and so forth). The nucleic acids adhering to the column can be removed by
washing with a low salt concentration buffer, a buffer containing a destabilizing agent such
as formamide, or by elevating the column temperature.



In one particularly preferred embodiment, the affinity matrix can be used in a method to
enrich a sample for unknown RNA sequences (e.g. expressed sequence tags (ESTs)). The
method involves first providing an affinity matrix bearing a library of oligonucleotide probes
specific to known RNA (e.g., EST) sequences. Then, RNA from undifferentiated and/or
unactivated cells and RNA from differentiated, activated, or pathological (e.g., transformed)
cells, or cells otherwise having a different metabolic state, are separately hybridized against
the affinity matrices to provide two pools of RNAs lacking the known RNA sequences.



In a preferred embodiment, the affinity matrix is packed into a columnar casing. The sample
is then applied to the affinity matrix (e.g. injected onto a column or applied to a column by a
pump such as a sampling pump driven by an autosampler). The affinity matrix (e.g. affinity
column) bearing the sample is subjected to conditions under which the nucleic acid probes
comprising the affinity matrix hybridize specifically with complementary target nucleic acids.
Such conditions are accomplished by maintaining appropriate pH, salt and temperature
conditions to facilitate hybridization as discussed above.

For a number of applications, it may be desirable to extract and separate messenger RNA
from cells, cellular debris, and other contaminants. As such, the device of the present
invention may, in some cases, include an mRNA purification chamber or channel. In general,
such purification takes advantage of the poly-A tails on mRNA. In particular and as noted
above, poly-T oligonucleotides may be immobilized within a chamber or channel of the
device to serve as affinity ligands for mRNA. Poly-T oligonucleotides may be immobilized
upon a solid support incorporated within the chamber or channel, or alternatively, may be
immobilized upon the surface(s) of the chamber or channel itself. Immobilization of
oligonucleotides on the surface of the chambers or channels may be carried out by methods
described herein including, e.g., oxidation and silanation of the surface followed by standard
DMT synthesis of the oligonucleotides.



In operation, the lysed sample is introduced to a high salt solution to increase the ionic
strength for hybridization, whereupon the mRNA will hybridize to the immobilized poly-T. The
mRNA bound to the immobilized poly-T oligonucleotides is then washed free in a low ionic
strength buffer. The poly-T oligonucleotides may be immobilized upon porous surfaces, e.g.,
porous silicon, zeolites, silica xerogels, sintered particles, or other solid supports.



Following sample preparation, the sample can be subjected to one or more different analysis
operations. A variety of analysis operations may generally be performed, including size
based analysis using, e.g., microcapillary electrophoresis, and/or sequence based analysis
using, e.g., hybridization to an oligonucleotide array.



In the latter case, the nucleic acid sample may be probed using an array of oligonucleotide
probes. Oligonucleotide arrays generally include a substrate having a large number of
positionally distinct oligonucleotide probes attached to the substrate. These arrays may be
produced using mechanical or light directed synthesis methods which incorporate a
combination of photolithographic methods and solid phase oligonucleotide synthesis
methods.

The basic strategy for light directed synthesis of oligonucleotide arrays is as follows. The
surface of a solid support, modified with photosensitive protecting groups, is illuminated
through a photolithographic mask, yielding reactive hydroxyl groups in the illuminated
regions. A selected nucleotide, typically in the form of a 3'-O-phosphoramidite-activated
deoxynucleoside (protected at the 5' hydroxyl with a photosensitive protecting group), is then
presented to the surface and coupling occurs at the sites that were exposed to light.
Following capping and oxidation, the substrate is rinsed and the surface is illuminated
through a second mask to expose additional hydroxyl groups for coupling. A second selected
nucleotide (e.g., 5'-protected, 3'-O-phosphoramidite-activated deoxynucleoside) is presented
to the surface. The selective deprotection and coupling cycles are repeated until the desired
set of products is obtained. Since photolithography is used, the process can be readily
miniaturized to generate high density arrays of oligonucleotide probes. Furthermore, the
sequence of the oligonucleotides at each site is known. See Pease et al. Mechanical
synthesis methods are similar to the light directed methods except that they involve mechanical
direction of fluids for deprotection and addition in the synthesis steps.
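
The cycle of masked illumination and coupling described above can be sketched as a simple simulation. This is an illustrative model only: each cycle is represented as a hypothetical pair of a nucleotide and a mask (the set of site indices exposed to light), and capping, oxidation, and coupling yield are ignored. Each probe string is recorded in order of coupling, so with the 3'-phosphoramidite chemistry described it reads from the surface-attached end outward.

```python
def synthesize_array(n_sites, cycles):
    """Simulate mask-directed synthesis on an array of n_sites positions.

    cycles: list of (nucleotide, mask) pairs, where mask is the set of site
    indices illuminated (deprotected) before that nucleotide is presented.
    Returns one probe string per site, in order of coupling.
    """
    probes = [""] * n_sites
    for nucleotide, mask in cycles:
        for site in mask:
            # Coupling occurs only at sites exposed through the mask.
            probes[site] += nucleotide
    return probes


# Hypothetical four-cycle schedule producing four distinct dinucleotides.
cycles = [
    ("A", {0, 1}),
    ("C", {2, 3}),
    ("G", {0, 2}),
    ("T", {1, 3}),
]
probes = synthesize_array(4, cycles)
```

Because the mask schedule is fixed in advance, the sequence at every site is known without any readout, which is the property the text relies on.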

For some embodiments, oligonucleotide arrays may be prepared having all possible probes 
of a given length. The hybridization pattern of the target sequence on the array may be used 
to reconstruct the target DNA sequence. Hybridization analysis of large numbers of probes 
can be used to sequence long stretches of DNA or provide an oligonucleotide array which is 
specific and complementary to a particular nucleic acid sequence. For example, in 
particularly preferred aspects, the oligonucleotide array will contain oligonucleotide probes 
which are complementary to specific target sequences, and individual or multiple mutations 
of these. Such arrays are particularly useful in the diagnosis of specific disorders which are 
characterized by the presence of a particular nucleic acid sequence. 
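
As a rough illustration of reconstructing a target from its hybridization pattern on an array of all probes of a given length, the sketch below makes several simplifying assumptions that are not taken from the text: each probe is labeled by the target subsequence it detects (rather than by its complementary sequence), hybridization is perfect-match only, and every subsequence of the probe length occurs once in the target so that overlaps are unambiguous. The function names are hypothetical.

```python
from itertools import product


def all_probes(k):
    """Every possible probe of length k over the four bases."""
    return {"".join(p) for p in product("ACGT", repeat=k)}


def hybridization_pattern(target, k):
    """Probes that light up: those matching a length-k window of the target."""
    present = {target[i:i + k] for i in range(len(target) - k + 1)}
    return all_probes(k) & present


def reconstruct(pattern):
    """Chain probes by (k-1)-base overlaps; fails if the chain is ambiguous."""
    pattern = set(pattern)
    suffixes = {p[1:] for p in pattern}
    starts = [p for p in pattern if p[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("no unique starting probe")
    seq = starts[0]
    overlap = len(seq) - 1
    remaining = pattern - {seq}
    while remaining:
        nxt = [p for p in remaining if p[:-1] == seq[-overlap:]]
        if len(nxt) != 1:
            raise ValueError("ambiguous or broken overlap")
        seq += nxt[0][-1]
        remaining.remove(nxt[0])
    return seq


pattern = hybridization_pattern("ATGCGTAC", 4)
rebuilt = reconstruct(pattern)
```

Repeated subsequences break the uniqueness assumption, which is one reason practical reconstruction from hybridization data is harder than this sketch suggests.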

Following sample collection and nucleic acid extraction, the nucleic acid portion of the 
sample is typically subjected to one or more preparative reactions. These preparative 
reactions include in vitro transcription, labeling, fragmentation, amplification and other 
reactions. Nucleic acid amplification increases the number of copies of the target nucleic 
acid sequence of interest. A variety of amplification methods are suitable for use in the 
methods and device of the present invention, including for example, the polymerase chain
reaction (PCR), the ligase chain reaction (LCR), self sustained sequence
replication (3SR), and nucleic acid sequence based amplification (NASBA).

The latter two amplification methods involve isothermal reactions based on isothermal 
transcription, which produce both single stranded RNA (ssRNA) and double stranded DNA 
(dsDNA) as the amplification products in a ratio of approximately 30 or 100 to 1, respectively. 
As a result, where these latter methods are employed, sequence analysis may be carried out 
using either type of substrate, i.e. complementary to either DNA or RNA. 

Frequently, it is desirable to amplify the nucleic acid sample prior to hybridization. One of 
skill in the art will appreciate that whatever amplification method is used, if a quantitative 
result is desired, care must be taken to use a method that maintains or controls for the 
relative frequencies of the amplified nucleic acids. 



Methods of "quantitative" amplification are well known to those of skill in the art. For 
example, quantitative PCR involves simultaneously co-amplifying a known quantity of a 
control sequence using the same primers. This provides an internal standard that may be 
used to calibrate the PCR reaction. The high density array may then include probes specific 
to the internal standard for quantification of the amplified nucleic acid. 
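
The role of the internal standard can be made concrete with a minimal calculation. The sketch assumes, beyond what the text states, that the control and the target amplify with the same efficiency and that array signal is proportional to the amount of amplified product; the function name and the numbers are hypothetical.

```python
def estimate_target_amount(target_signal, standard_signal, standard_amount):
    """Estimate the starting amount of target from an internal standard.

    Assumptions: equal amplification efficiency for target and standard,
    and a linear relation between signal and amplified product.
    """
    if standard_signal <= 0:
        raise ValueError("standard signal must be positive")
    return standard_amount * target_signal / standard_signal


# Hypothetical readings: the target probe is three times brighter than the
# probe for a standard spiked in at 5.0 units.
amount = estimate_target_amount(600.0, 200.0, 5.0)
```

If the two templates amplify with different efficiencies the ratio drifts with cycle number, which is why the text stresses using a method that maintains or controls for relative frequencies.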



PCR

Thus, in one embodiment, this invention provides for a method of optimizing a probe set for
detection of a particular gene. Generally, this method involves providing a high density array
containing a multiplicity of probes of one or more particular length(s) that are complementary
to subsequences of the mRNA transcribed by the target gene. In one embodiment the high
density array may contain every probe of a particular length that is complementary to a
particular mRNA. The probes of the high density array are then hybridized with their target
nucleic acid alone and then hybridized with a high complexity, high concentration nucleic
acid sample that does not contain the targets complementary to the probes. Thus, for
example, where the target nucleic acid is an RNA, the probes are first hybridized with their
target nucleic acid alone and then hybridized with RNA made from a cDNA library (e.g.,
reverse transcribed polyA⁺ mRNA) where the sense of the hybridized RNA is opposite
that of the target nucleic acid (to ensure that the high complexity sample does not contain
targets for the probes). Those probes that show a strong hybridization signal with their target
and little or no cross-hybridization with the high complexity sample are preferred probes for
use in the high density arrays of this invention.
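
The selection rule in the preceding paragraph can be sketched as a filter over two sets of measured intensities. The thresholds, parameter names, and probe identifiers below are hypothetical; the text specifies only that preferred probes show a strong target signal and little or no cross-hybridization.

```python
def select_probes(target_signal, background_signal, min_signal, max_cross_ratio):
    """Return probe names passing both criteria, sorted alphabetically.

    target_signal: probe -> intensity when hybridized with the target alone.
    background_signal: probe -> intensity with the high complexity sample
        lacking the target (missing probes are treated as 0.0).
    min_signal: minimum acceptable target intensity (assumed threshold).
    max_cross_ratio: maximum allowed background/target ratio (assumed).
    """
    selected = []
    for probe, signal in target_signal.items():
        cross = background_signal.get(probe, 0.0)
        if signal >= min_signal and cross <= max_cross_ratio * signal:
            selected.append(probe)
    return sorted(selected)


# Hypothetical intensities for three candidate probes.
target_signal = {"p1": 1000.0, "p2": 900.0, "p3": 50.0}
background_signal = {"p1": 20.0, "p2": 400.0, "p3": 1.0}
chosen = select_probes(target_signal, background_signal, 200.0, 0.1)
```

Here p2 is rejected for cross-hybridization and p3 for weak target signal, leaving p1 as the preferred probe.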



PCR amplification generally involves the use of one strand of the target nucleic acid 
sequence as a template for producing a large number of complements to that sequence. 
Generally, two primer sequences complementary to different ends of a segment of the 
complementary strands of the target sequence hybridize with their respective strands of the 
target sequence, and in the presence of polymerase enzymes and nucleoside triphosphates, 
the primers are extended along the target sequence. The extensions are melted from the 
target sequence and the process is repeated, this time with the additional copies of the 
target sequence synthesized in the preceding steps. PCR amplification typically involves 
repeated cycles of denaturation, hybridization and extension reactions to produce sufficient 
amounts of the target nucleic acid. The first step of each cycle of the PCR involves the 
separation of the nucleic acid duplex formed by the primer extension. Once the strands are 
separated, the next step in PCR involves hybridizing the separated strands with primers that 
flank the target sequence. The primers are then extended to form complementary copies of 
the target strands. For successful PCR amplification, the primers are designed so that the 
position at which each primer hybridizes along a duplex sequence is such that an extension 
product synthesized from one primer, when separated from the template (complement), 
serves as a template for the extension of the other primer. The cycle of denaturation, 
hybridization, and extension is repeated as many times as necessary to obtain the desired 
amount of amplified nucleic acid. 
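
How the number of cycles relates to the amount of product can be shown with an idealized model. The sketch assumes a constant per-cycle efficiency (1.0 meaning perfect doubling), which real reactions only approximate in their early cycles; the function names are hypothetical.

```python
def copies_after(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a number of PCR cycles.

    Each cycle multiplies the copy number by (1 + efficiency).
    """
    return initial_copies * (1 + efficiency) ** cycles


def cycles_needed(initial_copies, desired_copies, efficiency=1.0):
    """Smallest number of cycles reaching at least desired_copies."""
    if initial_copies <= 0 or efficiency <= 0:
        raise ValueError("initial_copies and efficiency must be positive")
    cycles = 0
    copies = initial_copies
    while copies < desired_copies:
        copies *= 1 + efficiency
        cycles += 1
    return cycles


# Ten starting templates with perfect doubling.
after_three = copies_after(10, 3)
to_million = cycles_needed(10, 1_000_000)
```

With perfect doubling, ten templates exceed one million copies after 17 cycles; lower efficiencies require correspondingly more cycles.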



In PCR methods, strand separation is normally achieved by heating the reaction to a
sufficiently high temperature for a sufficient time to cause the denaturation of the duplex but
not to cause an irreversible denaturation of the polymerase. Typical heat denaturation
involves temperatures ranging from about 80 °C to 105 °C for times ranging
from seconds to minutes. Strand separation, however, can be accomplished by any suitable
denaturing method including physical, chemical, or enzymatic means. Strand separation may
be induced by a helicase, for example, or an enzyme capable of exhibiting helicase activity.



In addition to PCR and IVT reactions, the methods and devices of the present invention are 
also applicable to a number of other reaction types, e.g., reverse transcription, nick 
translation, and the like. 

The nucleic acids in a sample will generally be labeled to facilitate detection in subsequent
steps. Labeling may be carried out during the amplification, in vitro transcription or nick
translation processes. In particular, amplification, in vitro transcription or nick translation may
incorporate a label into the amplified or transcribed sequence, either through the use of
labeled primers or the incorporation of labeled dNTPs into the amplified sequence.

Hybridization between the sample nucleic acid and the oligonucleotide probes upon the
array is then detected, using, e.g., epifluorescence confocal microscopy. Typically, the sample is
mixed during hybridization to enhance hybridization of nucleic acids in the sample to nucleic
acid probes on the array.

In some cases, hybridized oligonucleotides may be labeled following hybridization. For
example, where biotin labeled dNTPs are used in, e.g. amplification or transcription,
streptavidin linked reporter groups may be used to label hybridized complexes. Such
operations are readily integrated into the systems of the present invention. Alternatively,
the nucleic acids in the sample may be labeled following amplification. Post amplification
labeling typically involves the covalent attachment of a particular detectable group upon the
amplified sequences. Suitable labels or detectable groups include a variety of fluorescent or
radioactive labeling groups well known in the art. These labels may also be coupled to the
sequences using methods that are well known in the art.

Methods for detection depend upon the label selected. A fluorescent label is preferred
because of its extreme sensitivity and simplicity. Standard labeling procedures are used to
determine the positions where interactions between a sequence and a reagent take place.
For example, if a target sequence is labeled and exposed to a matrix of different probes, only
those locations where probes do interact with the target will exhibit any signal. Alternatively,
other methods may be used to scan the matrix to determine where interaction takes place.
Of course, the spectrum of interactions may be determined in a temporal manner by
repeated scans of interactions which occur at each of a multiplicity of conditions. However,
instead of testing each individual interaction separately, a multiplicity of sequence
interactions may be simultaneously determined on a matrix.



Means of detecting labeled target (sample) nucleic acids hybridized to the probes of the high
density array are known to those of skill in the art. Thus, for example, where a colorimetric
label is used, simple visualization of the label is sufficient. Where a radioactively labeled probe
is used, detection of the radiation (e.g., with photographic film or a solid state detector) is
sufficient.



In a preferred embodiment, however, the target nucleic acids are labeled with a fluorescent
label and the localization of the label on the probe array is accomplished with fluorescence
microscopy. The hybridized array is excited with a light source at the excitation wavelength
of the particular fluorescent label and the resulting fluorescence at the emission wavelength
is detected. In a particularly preferred embodiment, the excitation light source is a laser
appropriate for the excitation of the fluorescent label.



The target polynucleotide may be labeled by any of a number of convenient detectable 
markers. A fluorescent label is preferred because it provides a very strong signal with low 
background. It is also optically detectable at high resolution and sensitivity through a quick 
scanning procedure. Other potential labeling moieties include radioisotopes,
chemiluminescent compounds, labeled binding proteins, heavy metal atoms, spectroscopic
markers, magnetic labels, and linked enzymes.

Another method for labeling may bypass any label of the target sequence. The target may be 
exposed to the probes, and a double strand hybrid is formed at those positions only. Addition 
of a double strand specific reagent will detect where hybridization takes place. An
intercalating dye such as ethidium bromide may be used as long as the probes themselves
do not fold back on themselves to a significant extent forming hairpin loops. However, the
length of the hairpin loops in short oligonucleotide probes would typically be insufficient to 
form a stable duplex. 

Suitable chromogens will include molecules and compounds which absorb light in a
distinctive range of wavelengths so that a color may be observed, or which emit light when
irradiated with radiation of a particular wavelength or wavelength range, i.e., fluoresce.
Biliproteins, e.g., phycoerythrin, may also serve as labels.

A wide variety of suitable dyes are available, being primarily chosen to provide an intense
color with minimal absorption by their surroundings. Illustrative dye types include quinoline
dyes, triarylmethane dyes, acridine dyes, alizarine dyes, phthaleins, insect dyes, azo dyes,
anthraquinoid dyes, cyanine dyes, phenazathionium dyes, and phenazoxonium dyes.

A wide variety of fluorescers may be employed either by themselves or in conjunction with
quencher molecules. Fluorescers of interest fall into a variety of categories having certain
primary functionalities. These primary functionalities include 1- and 2-aminonaphthalene,
p,p'-diaminostilbenes, pyrenes, quaternary phenanthridine salts, 9-aminoacridines,
p,p'-diaminobenzophenone imines, anthracenes, oxacarbocyanine, merocyanine,
3-aminoequilenin, perylene, bis-benzoxazole, bis-p-oxazolyl benzene, 1,2-benzophenazin,
retinol, bis-3-aminopyridinium salts, hellebrigenin, tetracycline, sterophenol,
benzimidazolylphenylamine, 2-oxo-3-chromen, indole, xanthen, 7-hydroxycoumarin,
phenoxazine, salicylate, strophanthidin, porphyrins, triarylmethanes and flavin. Individual
fluorescent compounds which have functionalities for linking or which can be modified to
incorporate such functionalities include, e.g., dansyl chloride; fluoresceins such as
3,6-dihydroxy-9-phenylxanthhydrol; rhodamine isothiocyanate; N-phenyl
1-amino-8-sulfonatonaphthalene; N-phenyl 2-amino-6-sulfonatonaphthalene;
4-acetamido-4-isothiocyanato-stilbene-2,2'-disulfonic acid; pyrene-3-sulfonic acid;
2-toluidinonaphthalene-6-sulfonate; N-phenyl, N-methyl 2-aminonaphthalene-6-sulfonate;
ethidium bromide; stebrine; auromine-O-2-(9'-anthroyl)palmitate; dansyl
phosphatidylethanolamine; N,N'-dioctadecyl oxacarbocyanine; N,N'-dihexyl oxacarbocyanine;
merocyanine, 4-(3'-pyrenyl)butyrate; d-3-aminodesoxy-equilenin; 12-(9'-anthroyl)stearate;
2-methylanthracene; 9-vinylanthracene; 2,2-(vinylene-p-phenylene)bisbenzoxazole;
p-bis[2-(4-methyl-5-phenyl-oxazolyl)]benzene; 6-dimethylamino-1,2-benzophenazin; retinol;
bis(3'-aminopyridinium) 1,10-decandiyl diiodide; sulfonaphthylhydrazone of hellibrienin;
chlorotetracycline; N-(7-dimethylamino-4-methyl-2-oxo-3-chromenyl)maleimide;
N-[p-(2-benzimidazolyl)-phenyl]maleimide; N-(4-fluoranthyl)maleimide; bis(homovanillic
acid); resazarin; 4-chloro-7-nitro-2,1,3-benzooxadiazole; merocyanine 540; resorufin; rose
bengal; and 2,4-diphenyl-3(2H)-furanone.



Desirably, fluorescers should absorb light above about 300 nm, preferably above about 350 nm,
and more preferably above about 400 nm, usually emitting at wavelengths more than
about 10 nm higher than the wavelength of the light absorbed. It should be noted that the
absorption and emission characteristics of the bound dye may differ from the unbound dye. 
Therefore, when referring to the various wavelength ranges and characteristics of the dyes, it 
is intended to indicate the dyes as employed and not the dye which is unconjugated and 
characterized in an arbitrary solvent. 
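
The wavelength criteria above amount to a simple screen, sketched below. The default thresholds follow the figures in the text (absorption above about 300 nm, emission more than about 10 nm above the absorption wavelength); the function name and the example wavelength pairs are hypothetical and are not measurements of any particular dye.

```python
def meets_fluorescer_criteria(absorption_nm, emission_nm,
                              min_absorption_nm=300.0, min_shift_nm=10.0):
    """Check absorption and emission wavelengths against the stated criteria.

    Values should describe the dye as employed (e.g., conjugated), since
    bound and unbound dyes may differ.
    """
    absorbs_high_enough = absorption_nm > min_absorption_nm
    shift_large_enough = (emission_nm - absorption_nm) > min_shift_nm
    return absorbs_high_enough and shift_large_enough


# Hypothetical wavelength pairs (absorption, emission) in nm.
ok = meets_fluorescer_criteria(495.0, 520.0)
too_short = meets_fluorescer_criteria(280.0, 340.0)
small_shift = meets_fluorescer_criteria(450.0, 455.0)
```

Passing 350.0 or 400.0 as min_absorption_nm applies the preferred and more preferred limits mentioned in the text.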

Fluorescers are generally preferred because by irradiating a fluorescer with light, one can
obtain a plurality of emissions. Thus, a single label can provide for a plurality of measurable
events.

Detectable signal may also be provided by chemiluminescent and bioluminescent sources.
Chemiluminescent sources include a compound which becomes electronically excited by a
chemical reaction and may then emit light which serves as the detectable signal or donates
energy to a fluorescent acceptor. A diverse number of families of compounds have been
found to provide chemiluminescence under a variety of conditions. One family of compounds
is 2,3-dihydro-1,4-phthalazinedione. The most popular compound is luminol, which is the
5-amino compound. Other members of the family include the 5-amino-6,7,8-trimethoxy- and
the dimethylamino[ca]benz analog. These compounds can be made to luminesce with
alkaline hydrogen peroxide or calcium hypochlorite and base. Another family of compounds
is the 2,4,5-triphenylimidazoles, with lophine as the common name for the parent product.
Chemiluminescent analogs include para-dimethylamino and -methoxy substituents.
Chemiluminescence may also be obtained with oxalates, usually oxalyl active esters, e.g.,
p-nitrophenyl and a peroxide, e.g., hydrogen peroxide, under basic conditions. Alternatively,
luciferins may be used in conjunction with luciferase or lucigenins to provide
bioluminescence.

Spin labels are provided by reporter molecules with an unpaired electron spin which can be
detected by electron spin resonance (ESR) spectroscopy. Exemplary spin labels include
organic free radicals, transition metal complexes, particularly vanadium, copper, iron, and
manganese, and the like. Exemplary spin labels also include nitroxide free radicals.

In addition, amplified sequences may be subjected to other post amplification treatments.
For example, in some cases, it may be desirable to fragment the sequence prior to
hybridization with an oligonucleotide array, in order to provide segments which are more
readily accessible to the probes, which avoid looping and/or hybridization to multiple probes.
Fragmentation of the nucleic acids may generally be carried out by physical, chemical or
enzymatic methods that are known in the art.

Following the various sample preparation operations, the sample will generally be subjected
to one or more analysis operations. Particularly preferred analysis operations include, e.g.
sequence based analyses using an oligonucleotide array and/or size based analyses using,
e.g. microcapillary array electrophoresis.

In some embodiments it may be desirable to provide an additional, or alternative means for
analyzing the nucleic acids from the sample.

Microcapillary array electrophoresis generally involves the use of a thin capillary or channel
which may or may not be filled with a particular separation medium. Electrophoresis of a
sample through the capillary provides a size based separation profile for the sample.
Microcapillary array electrophoresis generally provides a rapid method for size based
sequencing, PCR product analysis and restriction fragment sizing. The high surface to
volume ratio of these capillaries allows for the application of higher electric fields across the
capillary without substantial thermal variation across the capillary, consequently allowing for
more rapid separations. Furthermore, when combined with confocal imaging methods these
methods provide sensitivity in the range of attomoles, which is comparable to the sensitivity
of radioactive sequencing methods.



In many capillary electrophoresis methods, the capillaries, e.g. fused silica capillaries or
channels etched, machined or molded into planar substrates, are filled with an appropriate
separation/sieving matrix. A variety of sieving matrices known in the art may be
used in the microcapillary arrays. Examples of such matrices include, e.g. hydroxyethyl
cellulose, polyacrylamide and agarose. Gel matrices may be introduced and polymerized
within the capillary channel. However, in some cases this may result in entrapment of
bubbles within the channels which can interfere with sample separations. Accordingly, it is
often desirable to place a preformed separation matrix within the capillary channel(s), prior to
mating the planar elements of the capillary portion. Fixing the two parts, e.g. through sonic
welding, permanently fixes the matrix within the channel. Polymerization outside of the
channels helps to ensure that no bubbles are formed. Further, the pressure of the welding
process helps to ensure a void-free system.

In addition to its use in nucleic acid "fingerprinting" and other size based analyses, the
capillary arrays may also be used in sequencing applications. In particular, gel based
sequencing techniques may be readily adapted for capillary array electrophoresis.

In addition to detection of mRNA, or as the sole detection method, expression products from
the genes discussed above may be detected as indications of the biological condition of the
tissue. Expression products may be detected in either the tissue sample as such, or in a
body fluid sample, such as blood, serum, plasma, faeces, mucus, sputum, cerebrospinal
fluid, and/or urine of the individual.



The expression products, peptides and proteins, may be detected by any suitable technique 
known to the person skilled in the art. 



In a preferred embodiment the expression products are detected by means of specific
antibodies directed to the various expression products, such as immunofluorescent and/or
immunohistochemical staining of the tissue.

Immunohistochemical localization of expressed proteins may be carried out by
immunostaining of tissue sections from the single tumors to determine which cells expressed
the protein encoded by the transcript in question. The transcript levels may be used to select
a group of proteins expected to show variation from sample to sample, making a rough
correlation between the level of protein detected and the intensity of the transcript on the
microarray possible.

For example, sections may be cut from paraffin-embedded tissue blocks, mounted, and
deparaffinized by incubation at 80 °C for 10 min, followed by immersion in heated oil at 60 °C
for 10 min (Estisol 312, Estichem A/S, Denmark) and rehydration. Antigen retrieval is
achieved in TEG (Tris-EDTA-glycerol) buffer using microwaves at 900 W. The tissue
sections may be cooled in the buffer for 15 min before a brief rinse in tap water. Endogenous
peroxidase activity is blocked by incubating the sections with 1% H2O2 for 20 min, followed
by three rinses in tap water, 1 min each. The sections may then be soaked in PBS buffer for
2 min. The next steps can be modified from the descriptions given by Oncogene Science
Inc., in the Mouse Immunohistochemistry Detection System, XHCOI (UniTect, Uniondale,
NY, USA). Briefly, the tissue sections are incubated overnight at 4 °C with primary antibody
(against beta-2 microglobulin (Dako), cytokeratin 8, cystatin-C (both from Europa, US), junB,
CD59, E-cadherin, apo-E, cathepsin E, vimentin, or IGFII (all from Santa Cruz)), followed by
three rinses in PBS buffer for 5 min each. Afterwards, the sections are incubated with
biotinylated secondary antibody for 30 min, rinsed three times with PBS buffer and
subsequently incubated with ABC (avidin-biotinylated horseradish peroxidase complex) for
30 min, followed by three rinses in PBS buffer.

Staining may be performed by incubation with AEC (3-amino-9-ethylcarbazole) for 10 min. The
tissue sections are counterstained with Mayer's hematoxylin, washed in tap water for 5 min,
and mounted with glycerol-gelatin. Positive and negative controls may be included in each
staining round with all antibodies.

In yet another embodiment the expression products may be detected by means of 
conventional enzyme assays, such as ELISA methods. 

Furthermore, the expression products may be detected by means of peptide/protein chips
capable of specifically binding the peptides and/or proteins assessed. Thereby an
expression pattern may be obtained.

Assay 
In a further aspect the invention relates to an assay for predicting the prognosis of a
biological condition in animal tissue, comprising

at least one first marker capable of detecting an expression level of at least one gene
selected from the group of genes consisting of gene No. 1 to gene No. 562.

Preferably the assay further comprises means for correlating the expression level to at least 
one standard expression level and/or at least one reference pattern. 

The means for correlating preferably includes one or more standard expression levels and/or
reference patterns for use in comparing or correlating the expression levels or patterns
obtained from a tumor under examination to the standards.
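
The correlation of an expression pattern to reference patterns can be illustrated with a minimal sketch. It assumes that each pattern is a list of expression levels for the same genes in the same order, and that Pearson correlation is used as the similarity measure; the text does not prescribe a particular measure, and the function names `pearson` and `closest_reference` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("a constant vector has no defined correlation")
    return cov / (sd_x * sd_y)


def closest_reference(sample, references):
    """Return (label, r) for the reference pattern best correlated with the sample.

    `references` maps a label (an assumption, e.g. "progression") to a
    reference pattern measured on the same genes as `sample`.
    """
    scores = {label: pearson(sample, ref) for label, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

In this sketch a tumor under examination would be assigned the label of the reference pattern with the highest correlation; any threshold on the correlation coefficient would have to be chosen empirically.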



Preferably the invention relates to an assay for determining an expression pattern of a
bladder cell, comprising at least a first marker and/or a second marker, wherein the first marker is
capable of detecting a gene from a first gene group as defined above, and/or the second
marker is capable of detecting a gene from a second gene group as defined above,
correlating the first expression level and/or the second expression level to a standard level of the
assessed genes to predict the prognosis of a biological condition in the animal tissue.
The marker(s) preferably specifically detect a gene as identified herein.

As described above, it is preferred to determine the expression level from more than one 
gene, and correspondingly, it is preferred to include more than one marker in the assay, 
such as at least two markers, such as at least three markers, such as at least four markers, 
such as at least five markers, such as at least six markers, such as at least seven markers, 
such as at least eight markers, such as at least nine markers, such as at least ten markers, 
such as at least 15 markers. 



When using markers for at least two different groups, it is preferred that the above number of
markers relate to markers in each group.

As discussed above the marker may be any nucleotide probe, such as a DNA, RNA, PNA, or
LNA probe capable of hybridising to mRNA indicative of the expression level. The
hybridisation conditions are preferably as described below for probes. In another embodiment the
marker is an antibody capable of specifically binding the expression product in question.

Patterns can be compared manually by a person or by a computer or other machine. An
algorithm can be used to detect similarities and differences. The algorithm may score and
compare, for example, the genes which are expressed and the genes which are not
expressed. Alternatively, the algorithm may look for changes in intensity of expression of a
particular gene and score changes in intensity between two samples. Similarities may be
determined on the basis of genes which are expressed in both samples and genes which are
not expressed in both samples, or on the basis of genes whose intensities of expression are
numerically similar.
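
The two kinds of comparison described above can be sketched as follows. This is an illustration only: it assumes each sample is a mapping from gene identifier to measured intensity, that a gene counts as expressed when its intensity reaches a user-chosen threshold, and that a change in intensity is scored as a fold change. The function names and parameters are hypothetical.

```python
def presence_similarity(sample_a, sample_b, threshold):
    """Fraction of shared genes given the same expressed/not-expressed call."""
    genes = sample_a.keys() & sample_b.keys()
    if not genes:
        raise ValueError("the samples share no genes")
    same_call = sum(
        (sample_a[g] >= threshold) == (sample_b[g] >= threshold) for g in genes
    )
    return same_call / len(genes)


def intensity_changes(sample_a, sample_b, min_fold):
    """Genes whose intensity differs at least `min_fold`-fold between samples.

    Returns a mapping gene -> fold change (sample_b relative to sample_a).
    Genes with a non-positive intensity in either sample are skipped,
    because a fold change is undefined for them.
    """
    changed = {}
    for gene in sample_a.keys() & sample_b.keys():
        a, b = sample_a[gene], sample_b[gene]
        if a <= 0 or b <= 0:
            continue
        fold = b / a
        if fold >= min_fold or fold <= 1 / min_fold:
            changed[gene] = fold
    return changed
```

The first function scores similarity on expressed and non-expressed genes; the second scores changes in intensity of individual genes between two samples.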

Generally, the detection operation will be performed using a reader device external to the 
diagnostic device. However, it may be desirable in some cases to incorporate the data 
gathering operation into the diagnostic device itself. 


The detection apparatus may be a fluorescence detector, or a spectroscopic detector, or 
another detector. 

Although hybridization is one type of specific interaction which is clearly useful in this
mapping embodiment, antibody reagents may also be very useful.

Gathering data from the various analysis operations, e.g. oligonucleotide and/or
microcapillary arrays, will typically be carried out using methods known in the art. For
example, the arrays may be scanned using lasers to excite fluorescently labeled targets that
have hybridized to regions of the probe arrays mentioned above, which can then be imaged
using charge-coupled devices ("CCDs") for wide-field scanning of the array. Alternatively,
another particularly useful method for gathering data from the arrays is through the use of
laser confocal microscopy, which combines the ease and speed of a readily automated
process with high-resolution detection.


Following the data gathering operation, the data will typically be reported to a data analysis
operation. To facilitate the sample analysis operation, the data obtained by the reader from
the device will typically be analyzed using a digital computer. Typically, the computer will be
appropriately programmed for receipt and storage of the data from the device, as well as for
analysis and reporting of the data gathered, i.e., interpreting fluorescence data to determine
the sequence of hybridizing probes, normalization of background and single base mismatch
hybridizations, ordering of sequence data in SBH applications, and the like.
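
As an illustration of the background normalization mentioned above, the following sketch subtracts a single background value from the raw probe intensities and rescales the result to a common median. Both steps are assumptions chosen for simplicity; the text does not specify the normalization procedure, and the function name `normalize` and its parameters are hypothetical.

```python
from statistics import median


def normalize(intensities, background, target_median=1.0):
    """Background-subtract raw intensities and scale them to a common median.

    `intensities` maps probe identifier -> raw fluorescence intensity.
    Values at or below the background are set to zero; the remaining
    positive values are scaled so that their median equals `target_median`.
    """
    corrected = {p: max(v - background, 0.0) for p, v in intensities.items()}
    positive = [v for v in corrected.values() if v > 0]
    if not positive:
        raise ValueError("no probe exceeds the background")
    scale = target_median / median(positive)
    return {p: v * scale for p, v in corrected.items()}
```

Scaling to a common median is one simple way of making arrays comparable before patterns are correlated; other schemes could equally be used.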

The invention also relates to a pharmaceutical composition for treating a biological condition,
such as bladder tumors.

In one embodiment the pharmaceutical composition comprises one or more of the peptides
being expression products as defined above. In a preferred embodiment, the peptides are
bound to carriers. The peptides may suitably be coupled to a polymer carrier, for example a
protein carrier, such as BSA. Such formulations are well-known to the person skilled in the
art.



The peptides may be suppressor peptides normally lost or decreased in tumor tissue,
administered in order to stabilise tumors towards a less malignant stage. In another embodiment
the peptides are onco-peptides capable of eliciting an immune response towards the tumor
cells.



In another embodiment the pharmaceutical composition comprises genetic material, either 
genetic material for substitution therapy, or for suppressing therapy as discussed below. 

In a third embodiment the pharmaceutical composition comprises at least one antibody
produced as described above.



In the present context the term pharmaceutical composition is used synonymously with the
term medicament. The medicament of the invention comprises an effective amount of one or
more of the compounds as defined above, or a composition as defined above in combination
with pharmaceutically acceptable additives. Such medicament may suitably be formulated
for oral, percutaneous, intramuscular, intravenous, intracranial, intrathecal,
intracerebroventricular, intranasal or pulmonary administration. For most indications a localised or
substantially localised application is preferred.

Strategies in formulation development of medicaments and compositions based on the
compounds of the present invention generally correspond to formulation strategies for any other
protein-based drug product. Potential problems and the guidance required to overcome
these problems are dealt with in several textbooks, e.g. "Therapeutic Peptides and Proteins:
Formulation, Processing and Delivery Systems", Ed. A.K. Banga, Technomic Publishing AG,
Basel, 1995.

Injectables are usually prepared either as liquid solutions or suspensions, or as solid forms
suitable for solution in, or suspension in, liquid prior to injection. The preparation may also be
emulsified. The active ingredient is often mixed with excipients which are pharmaceutically
acceptable and compatible with the active ingredient. Suitable excipients are, for example,
water, saline, dextrose, glycerol, ethanol or the like, and combinations thereof. In addition, if
desired, the preparation may contain minor amounts of auxiliary substances such as wetting
or emulsifying agents, pH buffering agents, or agents which enhance the effectiveness or
transportation of the preparation.



Formulations of the compounds of the invention can be prepared by techniques known to the
person skilled in the art. The formulations may contain pharmaceutically acceptable carriers
and excipients including microspheres, liposomes, microcapsules and nanoparticles.

The preparation may suitably be administered by injection, optionally at the site where the
active ingredient is to exert its effect. Additional formulations which are suitable for other
modes of administration include suppositories, and in some cases, oral formulations. For
suppositories, traditional binders and carriers include polyalkylene glycols or triglycerides.
Such suppositories may be formed from mixtures containing the active ingredient(s) in the
range of from 0.5% to 10%, preferably 1-2%. Oral formulations include such normally
employed excipients as, for example, pharmaceutical grades of mannitol, lactose, starch,
magnesium stearate, sodium saccharin, cellulose, magnesium carbonate, and the like. These
compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained
release formulations or powders and generally contain 10-95% of the active ingredient(s),
preferably 25-70%.


The preparations are administered in a manner compatible with the dosage formulation, and
in such amount as will be therapeutically effective. The quantity to be administered depends
on the subject to be treated, including, e.g. the weight and age of the subject, the disease to
be treated and the stage of disease. Suitable dosage ranges are of the order of several
hundred µg active ingredient per administration with a preferred range of from about 0.1 µg to
1000 µg, such as in the range of from about 1 µg to 300 µg, and especially in the range of
from about 10 µg to 50 µg. Administration may be performed once or may be followed by
subsequent administrations. The dosage will also depend on the route of administration and
will vary with the age and weight of the subject to be treated. A preferred dose would be in
the interval 30 mg to 70 mg per 70 kg body weight.

Some of the compounds of the present invention are sufficiently active, but for some of the
others, the effect will be enhanced if the preparation further comprises pharmaceutically
acceptable additives and/or carriers. Such additives and carriers will be known in the art. In
some cases, it will be advantageous to include a compound which promotes delivery of the
active substance to its target.

In many instances, it will be necessary to administer the formulation multiple times.
Administration may be a continuous infusion, such as intraventricular infusion, or administration
in several doses, such as several times a day, daily, several times a week, weekly, etc.

Vaccines

In a further embodiment the present invention relates to a vaccine for the prophylaxis or 
treatment of a biological condition comprising at least one expression product from at least 
one gene said gene being expressed as defined above. 


The term vaccines is used with its normal meaning, i.e. preparations of immunogenic material
for administration to induce in the recipient an immunity to infection or intoxication by a given
infecting agent. Vaccines may be administered by intravenous injection or through oral,
nasal and/or mucosal administration. Vaccines may be either simple vaccines prepared from
one species of expression products, such as proteins or peptides, or a variety of expression
products, or they may be mixed vaccines containing two or more simple vaccines. They are
prepared in such a manner as not to destroy the immunogenic material, although the
methods of preparation vary, depending on the vaccine.

The enhanced immune response achieved according to the invention can be attributable to
e.g. an enhanced increase in the level of immunoglobulins or in the level of T-cells, including
cytotoxic T-cells, and will result in immunisation of at least 50% of individuals exposed to said
immunogenic composition or vaccine, such as at least 55%, for example at least 60%, such
as at least 65%, for example at least 70%, for example at least 75%, such as at least 80%,
for example at least 85%, such as at least 90%, for example at least 92%, such as at least
94%, for example at least 96%, such as at least 97%, for example at least 98%, such as at
least 98.5%, for example at least 99%, for example at least 99.5% of the individuals exposed
to said immunogenic composition or vaccine.

Compositions according to the invention may also comprise any carrier and/or adjuvant
known in the art including functional equivalents thereof. Functionally equivalent carriers are
capable of presenting the same immunogenic determinant in essentially the same steric
conformation when used under similar conditions. Functionally equivalent adjuvants are
capable of providing similar increases in the efficacy of the composition when used under
similar conditions.



Therapy 

The invention further relates to a method of treating individuals suffering from the biological 
condition in question, in particular for treating a bladder tumor. 


Accordingly, the invention relates to a method for reducing cell tumorigenicity or malignancy
of a cell, said method comprising contacting a tumor cell with at least one peptide expressed
by at least one gene selected from the group of genes consisting of gene No. 200-214, 233,
234, 235, 236, 244, 249, 251, 252, 255, 256, 259, 261, 262, 266, 268, 269, 273, 274, 275,
276, 277, 279, 280, 281, 282, 285, 286, 289, 293, 295, 296, 299, 301, 304, 306, 307, 308,
311, 312, 313, 314, 320, 322, 323, 325, 326, 327, 328, 330, 331, 332, 333, 334, 338, 341,
342, 343, 345, 348, 349, 350, 351, 352, 353, 355, 357, 360, 361, 363, 366, 367, 370, 373,
374, 375, 376, 385, 386, 387, 389, 390, 392, 394, 398, 400, 401, 405, 406, 407, 408, 410,
411, 412, 414, 415, 416, 418, 424, 426, 428, 433, 434, 435, 436, 438, 439, 440, 441, 442,
443, 445, 446, 453, 460, 461, 463, 464, 465, 466, 467, 469, 470, 471, 472, 473, 475, 476,
477, 479, 480, 481, 482, 483, 485, 486, 487, 488, 490, 492, 494, 496, 497, 498, 499, 503,
515, 516, 517, 521, 526, 527, 528, 530, 532, 533, 537, 539, 540, 541, 542, 543, 545, 554,
557, 560.

In order to increase the effect several different peptides may be used simultaneously, such 
as wherein the tumor cell is contacted with at least two different peptides. 

In one embodiment the invention relates to a method of substitution therapy, i.e.
administration of genetic material generally expressed in normal cells, but lost or decreased 
in biological condition cells (tumor suppressors). Thus, the invention relates to a method for 
reducing cell tumorigenicity or malignancy of a cell, said method comprising 

obtaining at least one gene selected from the group of genes consisting of gene No. 200-214,
233, 234, 235, 236, 244, 249, 251, 252, 255, 256, 259, 261, 262, 266, 268, 269, 273,
274, 275, 276, 277, 279, 280, 281, 282, 285, 286, 289, 293, 295, 296, 299, 301, 304, 306,
307, 308, 311, 312, 313, 314, 320, 322, 323, 325, 326, 327, 328, 330, 331, 332, 333, 334,
338, 341, 342, 343, 345, 348, 349, 350, 351, 352, 353, 355, 357, 360, 361, 363, 366, 367,
370, 373, 374, 375, 376, 385, 386, 387, 389, 390, 392, 394, 398, 400, 401, 405, 406, 407,
408, 410, 411, 412, 414, 415, 416, 418, 424, 426, 428, 433, 434, 435, 436, 438, 439, 440,
441, 442, 443, 445, 446, 453, 460, 461, 463, 464, 465, 466, 467, 469, 470, 471, 472, 473,
475, 476, 477, 479, 480, 481, 482, 483, 485, 486, 487, 488, 490, 492, 494, 496, 497, 498,
499, 503, 515, 516, 517, 521, 526, 527, 528, 530, 532, 533, 537, 539, 540, 541, 542, 543,
545, 554, 557, 560, and

introducing said at least one gene into the tumor cell in a manner allowing expression of said 
gene(s). 

In one embodiment at least one gene is introduced into the tumor cell. In another
embodiment at least two genes are introduced into the tumor cell.

In one aspect of the invention, small molecules that either inhibit increased gene expression
or its effects, or substitute for decreased gene expression or its effects, are introduced to the
cellular environment or the cells. Application of small molecules to tumor cells may be
performed by e.g. local application, intravenous injection or oral ingestion. Small
molecules have the ability to restore the function of genes with reduced expression in tumor
or cancer tissue.



In another aspect the invention relates to a therapy whereby genes whose expression
(increased and/or decreased) is generally correlated to disease are inhibited by one or more
of the following methods:



A method for reducing cell tumorigenicity or malignancy of a cell, said method comprising 

obtaining at least one nucleotide probe capable of hybridising with at least one gene of a
tumor cell, said at least one gene being selected from the group of genes consisting of gene
Nos. 1-199, 215-232, 237, 238, 239, 240, 241, 242, 243, 245, 246, 247, 248, 250, 253, 254,
257, 258, 260, 263, 264, 265, 267, 270, 271, 272, 278, 283, 284, 287, 288, 290, 291, 292,
294, 297, 298, 300, 302, 303, 305, 309, 310, 315, 316, 317, 318, 319, 321, 324, 329, 335,
336, 337, 339, 340, 344, 346, 347, 354, 356, 358, 359, 362, 364, 365, 368, 369, 371, 372,
377, 378, 379, 380, 381, 382, 383, 384, 388, 391, 393, 395, 396, 397, 399, 402, 403, 404,
409, 413, 417, 419, 420, 421, 422, 423, 425, 427, 429, 430, 431, 432, 437, 444, 447, 448,
449, 450, 451, 452, 454, 455, 456, 457, 458, 459, 462, 468, 474, 478, 484, 489, 491, 493,
495, 500, 501, 502, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 518, 519, 520,
522, 523, 524, 525, 529, 531, 534, 535, 536, 538, 544, 546, 547, 548, 549, 550, 551, 552,
553, 555, 556, 558, 559, 561, 562, and

introducing said at least one nucleotide probe into the tumor cell in a manner allowing the 
probe to hybridise to the at least one gene, thereby inhibiting expression of said at least one 
gene. This method is preferably based on anti-sense technology, whereby the hybridisation 
of said probe to the gene leads to a down-regulation of said gene. 

In another preferred embodiment, the method for reducing cell tumorigenicity or malignancy
of a cell is based on RNA interference, comprising small interfering RNAs (siRNAs)
specifically directed against at least one gene being selected from the group of genes
consisting of gene Nos. 1-199, 215-232, 237, 238, 239, 240, 241, 242, 243, 245, 246, 247,
248, 250, 253, 254, 257, 258, 260, 263, 264, 265, 267, 270, 271, 272, 278, 283, 284, 287,
288, 290, 291, 292, 294, 297, 298, 300, 302, 303, 305, 309, 310, 315, 316, 317, 318, 319,
321, 324, 329, 335, 336, 337, 339, 340, 344, 346, 347, 354, 356, 358, 359, 362, 364, 365,
368, 369, 371, 372, 377, 378, 379, 380, 381, 382, 383, 384, 388, 391, 393, 395, 396, 397,
399, 402, 403, 404, 409, 413, 417, 419, 420, 421, 422, 423, 425, 427, 429, 430, 431, 432,
437, 444, 447, 448, 449, 450, 451, 452, 454, 455, 456, 457, 458, 459, 462, 468, 474, 478,
484, 489, 491, 493, 495, 500, 501, 502, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513,
514, 518, 519, 520, 522, 523, 524, 525, 529, 531, 534, 535, 536, 538, 544, 546, 547, 548,
549, 550, 551, 552, 553, 555, 556, 558, 559, 561, 562.



The down-regulation may of course also be based on a probe capable of hybridising to
regulatory components of the genes in question, such as promoters.

The hybridization may be tested in vitro at conditions corresponding to in vivo conditions.
Typically, hybridization conditions are of low to moderate stringency. These conditions
favour specific interactions between completely complementary sequences, but allow some
non-specific interaction between less than perfectly matched sequences to occur as well.
After hybridization, the nucleic acids can be "washed" under moderate or high conditions of
stringency to dissociate duplexes that are bound together by some non-specific interaction
(the nucleic acids that form these duplexes are thus not completely complementary).

As is known in the art, the optimal conditions for washing are determined empirically, often
by gradually increasing the stringency. The parameters that can be changed to affect
stringency include, primarily, temperature and salt concentration. In general, the lower the salt
concentration and the higher the temperature, the higher the stringency. Washing can be
initiated at a low temperature (for example, room temperature) using a solution containing a
salt concentration that is equivalent to or lower than that of the hybridization solution.
Subsequent washing can be carried out using progressively warmer solutions having the same
salt concentration. As alternatives, the salt concentration can be lowered and the
temperature maintained in the washing step, or the salt concentration can be lowered and the
temperature increased. Additional parameters can also be altered. For example, use of a
destabilizing agent, such as formamide, alters the stringency conditions.

In reactions where nucleic acids are hybridized, the conditions used to achieve a given level
of stringency will vary. There is not one set of conditions, for example, that will allow
duplexes to form between all nucleic acids that are 85% identical to one another; hybridization
also depends on unique features of each nucleic acid. The length of the sequence, the
composition of the sequence (for example, the content of purine-like nucleotides versus the
content of pyrimidine-like nucleotides) and the type of nucleic acid (for example, DNA or
RNA) affect hybridization. An additional consideration is whether one of the nucleic acids is
immobilized (for example on a filter).


An example of a progression from lower to higher stringency conditions is the following,
where the salt content is given as the relative abundance of SSC (a salt solution containing
sodium chloride and sodium citrate; 2X SSC is 10-fold more concentrated than 0.2X SSC).
Nucleic acids are hybridized at 42°C in 2X SSC/0.1% SDS (sodium dodecylsulfate; a
detergent) and then washed in 0.2X SSC/0.1% SDS at room temperature (for conditions of low
stringency); 0.2X SSC/0.1% SDS at 42°C (for conditions of moderate stringency); and 0.1X
SSC at 68°C (for conditions of high stringency). Washing can be carried out using only one
of the conditions given, or each of the conditions can be used (for example, washing for
10-15 minutes each in the order listed above). Any or all of the washes can be repeated. As
mentioned above, optimal conditions will vary and can be determined empirically.
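
The wash progression above can be written down as data and checked for consistency. The sketch below assumes the standard SSC recipe (1X SSC is 0.15 M sodium chloride and 0.015 M trisodium citrate) and, because the text only says "room temperature", assumes 22°C for that step; the names `WASHES`, `sodium_molarity` and `is_increasing_stringency` are hypothetical.

```python
# Standard recipe: 1X SSC = 0.15 M NaCl + 0.015 M trisodium citrate.
NACL_M_1X = 0.15
CITRATE_M_1X = 0.015

ROOM_TEMP_C = 22.0  # assumption: the text only says "room temperature"

# (label, SSC strength, SDS percent or None if not stated, temperature in deg C)
WASHES = [
    ("low", 0.2, 0.1, ROOM_TEMP_C),
    ("moderate", 0.2, 0.1, 42.0),
    ("high", 0.1, None, 68.0),
]


def sodium_molarity(ssc_strength):
    """Total Na+ (mol/L): one Na+ per NaCl plus three per trisodium citrate."""
    return ssc_strength * (NACL_M_1X + 3 * CITRATE_M_1X)


def is_increasing_stringency(washes):
    """True if salt never rises and temperature never falls between steps."""
    pairs = zip(washes, washes[1:])
    return all(b[1] <= a[1] and b[3] >= a[3] for a, b in pairs)
```

The check encodes the rule stated above that lower salt and higher temperature give higher stringency; it does not model formamide or sequence-specific effects.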



In another aspect a method of reducing tumorigenicity relates to the use of antibodies
against an expression product of a cell from the biological tissue. The antibodies may be
produced by any suitable method, such as a method comprising the steps of

obtaining expression product(s) from at least one gene, said gene being expressed as
defined above,

immunising a mammal with said expression product(s), and

obtaining antibodies against the expression product.

Use 

The methods described above may be used for producing an assay for diagnosing a
biological condition in animal tissue, or for identification of the origin of a piece of tissue.
Further, the methods of the invention may be used for prediction of a disease course and
treatment response.

Furthermore, the invention relates to the use of a peptide as defined above for preparation of
a pharmaceutical composition for the treatment of a biological condition in animal tissue.

Furthermore, the invention relates to the use of a gene as defined above for preparation of a
pharmaceutical composition for the treatment of a biological condition in animal tissue.

Also, the invention relates to the use of a probe as defined above for preparation of a
pharmaceutical composition for the treatment of a biological condition in animal tissue.

The genetic material discussed above may be any of the described genes or functional
parts thereof. The constructs may be introduced as a single DNA molecule encoding all of
the genes, or different DNA molecules having one or more genes. The constructs may be
introduced simultaneously or consecutively, each with the same or different markers.






The gene may be linked to the complex as such or protected by any suitable system
normally used for transfection such as viral vectors or artificial viral envelopes, liposomes or
micelles, wherein the system is linked to the complex.



Numerous techniques for introducing DNA into eukaryotic cells are known to the skilled
artisan. Often this is done by means of vectors, and often in the form of nucleic acid
encapsidated by a (frequently virus-like) proteinaceous coat. Gene delivery systems may be applied
to a wide range of clinical as well as experimental applications.

Vectors containing useful elements such as selectable and/or amplifiable markers,
promoter/enhancer elements for expression in mammalian, particularly human, cells, and which
may be used to prepare stocks of construct DNAs and for carrying out transfections are well
known in the art. Many are commercially available.

Various techniques have been developed for modification of target tissue and cells in vivo. A
number of virus vectors, discussed below, are known which allow transfection and random
integration of the virus into the host. See, for example, Dubensky et al. (1984) Proc. Natl.
Acad. Sci. USA 81:7529-7533; Kaneda et al., (1989) Science 243:375-378; Hiebert et al.
(1989) Proc. Natl. Acad. Sci. USA 86:3594-3598; Hatzoglu et al., (1990) J. Biol. Chem.
265:17285-17293; Ferry et al. (1991) Proc. Natl. Acad. Sci. USA 88:8377-8381. Routes and
modes of administering the vector include injection, e.g. intravascularly or intramuscularly,
inhalation, or other parenteral administration.

Advantages of adenovirus vectors for human gene therapy include the fact that
recombination is rare, no human malignancies are known to be associated with such viruses, the
adenovirus genome is double-stranded DNA which can be manipulated to accept foreign genes
of up to 7.5 kb in size, and live adenovirus is a safe human vaccine organism.

Another vector which can express the DNA molecule of the present invention, and is useful
in gene therapy, particularly in humans, is vaccinia virus, which can be rendered
non-replicating (U.S. Pat. Nos. 5,225,336; 5,204,243; 5,155,020; 4,769,330).

Based on the concept of viral mimicry, artificial viral envelopes (AVE) are designed based on
the structure and composition of a viral membrane, such as HIV-1 or RSV, and used to
deliver genes into cells in vitro and in vivo. See, for example, U.S. Pat. No. 5,252,348; Schreier,
H. et al., J. Mol. Recognit., 1995, 8:59-62; Schreier, H. et al., J. Biol. Chem., 1994, 269:9090-
9098; Schreier, H., Pharm. Acta Helv. 1994, 68:145-159; Chander, R. et al., Life Sci., 1992,
50:481-489, which references are hereby incorporated by reference in their entirety. The
envelope is preferably produced in a two-step dialysis procedure where the "naked"
envelope is formed initially, followed by unidirectional insertion of the viral surface glycoprotein of
interest. This process and the physical characteristics of the resulting AVE are described in
detail by Chander et al. (supra). Examples of AVE systems are (a) an AVE containing the
HIV-1 surface glycoprotein gp160 (Chander et al., supra; Schreier et al., 1995, supra) or
glycosyl phosphatidylinositol (GPI)-linked gp120 (Schreier et al., 1994, supra), respectively,
and (b) an AVE containing the respiratory syncytial virus (RSV) attachment (G) and fusion
(F) glycoproteins (Stecenko, A. A. et al., Pharm. Pharmacol. Lett. 1:127-129 (1992)). Thus,
vesicles are constructed which mimic the natural membranes of enveloped viruses in their
ability to bind to and deliver materials to cells bearing corresponding surface receptors.


AVEs are used to deliver genes both by intravenous injection and by instillation in the lungs.
For example, AVEs are manufactured to mimic RSV, exhibiting the RSV F surface
glycoprotein which provides selective entry into epithelial cells. F-AVE are loaded with a plasmid
coding for the gene of interest (or a reporter gene such as CAT not present in mammalian
tissue).



The AVE system described herein is physically and chemically essentially identical to the
natural virus yet is entirely "artificial", as it is constructed from phospholipids, cholesterol, and
recombinant viral surface glycoproteins. Hence, there is no carry-over of viral genetic
information and no danger of inadvertent viral infection. Construction of the AVEs in two
independent steps allows for bulk production of the plain lipid envelopes which, in a separate
second step, can then be marked with the desired viral glycoprotein, also allowing for the
preparation of protein cocktail formulations if desired.

Another delivery vehicle for use in the present invention is based on the recent description
of attenuated Shigella as a DNA delivery system (Sizemore, D. R. et al., Science 270:299-
302 (1995), which reference is incorporated by reference in its entirety). This approach
exploits the ability of Shigellae to enter epithelial cells and escape the phagocytic vacuole as a
method for delivering the gene construct into the cytoplasm of the target cell. Invasion with
as few as one to five bacteria can result in expression of the foreign plasmid DNA delivered
by these bacteria.



A preferred type of mediator of nonviral transfection in vitro and in vivo is cationic
(ammonium derivatized) lipids. These positively charged lipids form complexes with negatively
charged DNA, resulting in DNA charge neutralization and compaction. The complexes are
endocytosed upon association with the cell membrane, and the DNA somehow escapes the
endosome, gaining access to the cytoplasm. Cationic lipid:DNA complexes appear highly
stable under normal conditions. Studies of the cationic lipid DOTAP suggest the complex
dissociates when the inner layer of the cell membrane is destabilized and anionic lipids from
the inner layer displace DNA from the cationic lipid. Several cationic lipids are available
commercially. Two of these, DMRIE and DC-cholesterol, have been used in human clinical
trials. First generation cationic lipids are less efficient than viral vectors. For delivery to lung,
any inflammatory responses accompanying the liposome administration are reduced by
changing the delivery mode to aerosol administration, which distributes the dose more
evenly.

Drug screening 

Genes identified as changing in various stages of bladder cancer can be used as markers for
drug screening. Thus by treating bladder cancer cells with test compounds or extracts, and
monitoring the expression of genes identified as changing in the progression of bladder
cancers, one can identify compounds or extracts which change expression of genes to a
pattern which is of an earlier stage or even of normal bladder mucosa.
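
The screening principle above can be sketched as a nearest-reference comparison. The sketch assumes that a reference expression profile is available for each stage, that profiles are vectors over the same marker genes, and that Euclidean distance is an acceptable measure of similarity; none of these choices is prescribed by the text, and the function names and the stage ordering passed in are hypothetical.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def nearest_stage(profile, references):
    """Label of the reference profile closest to the given expression profile."""
    return min(references, key=lambda stage: dist(profile, references[stage]))


def shifts_to_earlier_stage(untreated, treated, references, stage_order):
    """True if treatment moves the profile to an earlier stage in `stage_order`.

    `stage_order` lists the stage labels from earliest (e.g. normal mucosa)
    to latest; `references` maps each label to its reference profile.
    """
    before = stage_order.index(nearest_stage(untreated, references))
    after = stage_order.index(nearest_stage(treated, references))
    return after < before
```

A test compound or extract would be flagged when the profile of treated cells lies closest to the reference of an earlier stage than the profile of untreated cells.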

It is also within the scope of the invention to use small molecules in drug screening.

The following are non-limiting examples illustrating the present invention. 

EXAMPLES 


Example 1 

Identification of a molecular signature defining disease progression in patients with 
superficial bladder carcinoma 

Patient samples

Bladder tumor biopsies were obtained directly from surgery after removal of the necessary
amount of tissue for routine pathology examination. The tumors were frozen at -80°C in a
guanidinium thiocyanate solution for preservation of the RNA. Informed consent was obtained
in all cases, and the protocols were approved by the scientific ethical committee of
Aarhus County. The samples for the no progression group were selected by the following
criteria: a) Ta or T1 tumors with no prior higher stage tumors; b) a minimum follow-up period
of 12 months to the most recent routine cystoscopy examination of the bladder with no
occurrence of tumors of higher stage. The samples for the progression group were selected by
two criteria: a) Ta or T1 tumors with no prior higher stage tumors; b) subsequent progression
to a higher stage tumor, see Table 1.

Table 1. Clinical data on all patients involved in the study 
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Training set 



Group | Sample | Hist. | Progressed to: | Time to progression | Follow-up time, months
No prog. | 150-6 | Ta gr3 | | - | 44
No prog. | 997-1 | Ta gr2 | | - | 24
No prog. | 833-2 | Ta gr3 | | - | 35
No prog. | 1070-1 | Ta gr3 | | - | 33
No prog. | 968-1 | Ta gr2 | | - | 26
No prog. | 625-1 | T1 gr3 | | - | 12
No prog. | 880-1 | T1 gr3 | | - | 47
No prog. | 815-1 | Ta gr2 | | - | 49
No prog. | 861-1 | Ta gr2 | | - | 45
No prog. | 669-1 | Ta gr2 | | - | 55
No prog. | 368-4 | Ta gr2 | | - | 16
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898-1 
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- 


17 
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576-6 
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- 


36 
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747-3 
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T1 gr3 


6 
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956-2 


Tagr3 


T1 gr3 


27 


• 


Prog. | 1083-1 | Ta gr2 | T1 gr3 | 1 | -
Prog. | 686-3 | Ta gr2 | T1 gr2 | 6 | -
Prog. | 795-13 | Ta gr2 | T1 gr3 | 4 | -
Prog. | 865-1 | Ta gr2 | T1 gr2 | 5 | -


Prog. 


112-2 
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T1 gr3 


7 


- 
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6 


- 
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31 
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10 
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3 




Prog. 
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8 
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14 




Prog. 
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Test set 



Group | Sample | Hist. | Progressed to: | Time to progression | Follow-up time, months
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1008-1 
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55 
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1060-1 
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48 


No prog. 


1086-1 
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34 
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1105-1 
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31 


No prog. 


1145-1 


Tagr2 






39 


No prog. 


1352-1 


Tagr2 






26 
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Prog. 
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1 





Delineation of non-progressing tumors from progressing tumors 

To delineate non-progressing tumors from progressing tumors we profiled a total of 29
bladder tumor samples: 13 early stage bladder tumor samples without progression (median
follow-up time 35 months) and 16 early stage bladder tumor samples with progression (median
time to progression 7 months). See Table 1 for description of patient disease courses.
We analyzed gene expression changes between the two groups of tumors by hybridizing the
labeled RNA samples to customized Affymetrix GeneChips with 59,000 probe-sets to cover
virtually the entire transcriptome (~95% coverage). Low expressed and non-varying probe-sets
were eliminated from the data set and the resulting 6,647 probe-sets that showed variation
across the tumor samples were subjected to further analysis. These probe-sets represent
5,356 unique genes (Unigene clusters).
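
The filtering step described above can be sketched as follows. This is a minimal illustration only: the function name, the toy probe-set names and both thresholds are assumptions, since the text does not state the cutoffs used to arrive at the 6,647 probe-sets.

```python
from statistics import mean, stdev

def filter_probe_sets(expr, min_mean, min_sd):
    """Keep probe-sets whose mean and s.d. across samples reach the thresholds.

    expr maps a probe-set name to its expression values across samples.
    Both thresholds are hypothetical; the source does not give them.
    """
    kept = {}
    for probe, values in expr.items():
        if mean(values) >= min_mean and stdev(values) >= min_sd:
            kept[probe] = values
    return kept

# Toy data: one low-expressed, one non-varying and one varying probe-set.
expr = {
    "probe_low": [5.0, 6.0, 4.0, 5.0],
    "probe_flat": [500.0, 501.0, 499.0, 500.0],
    "probe_var": [200.0, 900.0, 150.0, 1100.0],
}
filtered = filter_probe_sets(expr, min_mean=100.0, min_sd=75.0)
print(sorted(filtered))
```

Only the varying, adequately expressed probe-set survives in this toy example.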

Gene expression similarities between tumor biopsies

We analyzed gene expression similarities between the tumor biopsies using unsupervised
hierarchical cluster analysis (Fig. 1). This showed a notable distinction between the
non-progressing and the progressing tumors when using the 3,197 most varying probe-sets (s.d.
≥ 75) for clustering (4 errors; χ² test, P = 0.0001). Using other gene-sets based on different
gene variation criteria demonstrated the same distinction between the tumor groups. Two of
the samples that show later progression (825-3 and 112-2) were found in the non-progression
branch of the cluster dendrogram and two of the non-progressing samples (815-1
and 150-6) were found in the progression branch. This distinct separation of the samples
indicated a considerable biological difference between the two groups of tumors. Notably,
the T1 tumors did not cluster separately from Ta tumors; however, they did form a sub-cluster
in the progressing branch of the dendrogram. Based on this we decided to look for a
general signature of progression disregarding pathologic staging of the tumors.
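
As a worked check of the reported significance, the sketch below tabulates the cluster branches against the clinical groups (13 non-progressing and 16 progressing samples, two of each misplaced) and computes a chi-square statistic. It is assumed here, not stated in the text, that the test was an uncorrected Pearson chi-square on this 2 x 2 table; the function name is illustrative.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its P value for 1 degree of freedom, using the
    identity P(X > x) = erfc(sqrt(x / 2)) for a chi-square variable with 1 d.f.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Non-progression branch: 11 non-progressing + 2 progressing samples.
# Progression branch:      2 non-progressing + 14 progressing samples.
chi2, p = chi_square_2x2(11, 2, 2, 14)
print(round(chi2, 2), round(p, 4))
```

Under these assumptions the statistic is about 15.08 and the P value rounds to 0.0001, in line with the figure quoted above.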
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Selection of the 100 most significantly up-regulated genes In each group using t-test 
statistics 

We delineated the non-progressing tumors from the progressing tumors by selecting the 100
most significantly up-regulated genes in each group using t-test statistics (Fig. 2 and Table
2). Among the genes up-regulated in the non-progressing group we found the SERPINB5
and FAT tumor suppressor genes and the FGFR3 gene, which has been shown to be frequently
mutated in superficial bladder tumors with low recurrence rates (van Rhijn et al.
2001). Among the genes up-regulated in the progressing group we found the PLK (Yuan et
al. 1997), CDC25B (Galaktionov et al. 1991), CDC20 (Weinstein et al. 1994) and MCM7
(Hiraiwa et al. 1997) genes, which are involved in regulating cell cycle and cell proliferation.
Furthermore, in this group we identified the WHSC1, DD96 and GRB7 genes, which have
been predicted/computed (Gene Ontology) to be involved in oncogenic transformation.
Another interesting candidate in this group is the NRG1 gene, which through interaction with
the HER2/HER3 receptors has been found to induce differentiation of lung epithelial cells
(Liu & Kern 2002). The PPARD gene was also identified as up-regulated in the tumors that
show later progression. Disruption of this gene was found to decrease tumorigenicity in colon
cancer cells (Park et al. 2001). Furthermore, PPARD regulates VEGF expression in bladder
cancer cell lines (Fauconnet et al. 2002).
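
The ranking of genes by t-test statistics can be illustrated with the short sketch below. The text does not say which variant of the t-test was used, so an unequal-variance (Welch) statistic is assumed; the function names, group labels and toy gene names are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Unequal-variance t statistic for two groups of expression values."""
    return (mean(x) - mean(y)) / math.sqrt(variance(x) / len(x) + variance(y) / len(y))

def top_up_regulated(expr, labels, group, n):
    """Return the n genes with the largest t statistic in favour of `group`."""
    scores = {}
    for gene, values in expr.items():
        inside = [v for v, lab in zip(values, labels) if lab == group]
        outside = [v for v, lab in zip(values, labels) if lab != group]
        scores[gene] = welch_t(inside, outside)
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Toy data: three non-progressing samples followed by three progressing samples.
labels = ["no_prog"] * 3 + ["prog"] * 3
expr = {
    "geneA": [9.0, 10.0, 11.0, 1.0, 2.0, 3.0],
    "geneB": [1.0, 2.0, 3.0, 9.0, 10.0, 11.0],
    "geneC": [5.0, 6.0, 5.0, 6.0, 5.0, 6.0],
}
print(top_up_regulated(expr, labels, "no_prog", 1))
print(top_up_regulated(expr, labels, "prog", 1))
```

With n set to 100 and the call repeated for each group, this would yield a list of 200 markers of the kind shown in Table 2.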

Table 2. The 200 best markers of progression



Eos Hu03 ID | Unigene Build 133 | Description | T-test | 5% perm | Exemplar accession#
416640 | Hs.79404 | neuron-specific protein | 6.03 | 5.62 | BE262478
442220 | Hs.8148 | selenoprotein T | 5.98 | 5.06 | AL037800
426982 | Hs.173091 | ubiquitin-like 3 | 5.9 | 4.88 | AA149707
416815 | Hs.80120 | UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 1 (GalNAc-T1) | 5.52 | 4.67 | U41514
435521 | Hs.6361 | mitogen-activated protein kinase kinase 1 interacting protein 1 | 5.24 | 4.51 | W23814
447343 | Hs.236894 | ESTs, Highly similar to S02392 alpha-2-macroglobulin receptor precursor [H.sapiens] | 5.23 | 4.44 | AA256641
452829 | Hs.63368 | ESTs, Weakly similar to TRHY_HUMAN TRICHOHYALIN [H.sapiens] | 4.95 | 4.39 | AI955579
414895 | Hs.116278 | Homo sapiens cDNA FLJ13571 fis, clone PLACE1008405 | 4.94 | 4.31 | AW894856
426252 | Hs.28917 | ESTs | 4.9 | 4.26 | BE176980
444604 | Hs.11441 | chromosome 1 open reading frame 8 | 4.69 | 4.17 | AW327695
409632 | Hs.55279 | serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 5 | 4.89 | 4.13 | W74001
446556 | Hs.15303 | KIAA0349 protein | 4.87 | 4.08 | AB002347
426799 | Hs.303154 | popeye protein 3 | 4.86 | 4.03 | H14843
428115 | Hs.300855 | KIAA0977 protein | 4.86 | 4.00 | AB023194
419847 | Hs.184544 | Homo sapiens, clone IMAGE:3355383, mRNA, partial cds | 4.82 | 3.97 | AW390601
417839 | Hs.82712 | fragile X mental retardation, autosomal homolog 1 | 4.8 | 3.93 | AI815732
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428284 


Hs.183435 


NM 004545: Homo sapiens NADH dehydrogenase 
(ubiquinone) 1 beta subcomplex, 1 (7kD, MNLL) (NDUFB1), 
mRNA. 


4.78 


3.92 


AA535762 


422929 


Hs.94011 


ESTs, Weakly similar to MGB4JHUMAN MELANOMA- 
ASSOCIATED ANTIGEN B4 [H.sapiens] 


4.77 


3.90 




414762 


Hs.77257 


KIAA0068 protein 


4.72 


3.66 


AW068349 


453395 


Hs.377915 


mannosidase, alpha, class 2A, member 1 


4.71 


3.84 


D63998 


421311 


Hs.283609 


hypothetical protein PRO2032 


4.65 


3.82 


N71848 


446847 


Hs.82845 


Homo sapiens cDNA: FLJ21930 fis, clone HEP04301, highly
similar to HSU90916 Human clone 23815 mRNA sequence


4.65 


3.82 


T51454 


413840 


Hs.356228 


RNA binding motif protein, X chromosome 


4.62 


3.79 


AI301558


418321 


Hs.84087 


KIAA0143 protein


4.62 


3.78 


D63477 


430604 


Hs.247309 


succinate-CoA ligase, GDP-forming, beta subunit 


4.61 


3.74 


AV650537 


423185 


Hs.380062 


ornithine decarboxylase antizyme 1 


4.61


1 74 




417615 


Hs.82314 


hypoxanthine phosphoribosyltransferase 1 (Lesch-Nyhan 
syndrome) 


4 6 


1 70 




418504 


Hs.85335 


Homo sapiens mRNA; cDNA DKFZp564D1462 (from clone 
DKFZp564D1462) 


4.59 


1 aa 


RF1SQ71 A 


400846 




sortilin-related receptor, L(DLR class) A repeats-containing
(SORL1)


4.57 






426028 


Hs.172028 


a disintegrin and metalloproteinase domain 10 (ADAM 10) 


4.53 


3.65 


MM 00111 

o 


425243 


Hs. 155291 


KIAA0005 gene product 


4.47 


3.63 


N89487 


434978 


Hs.4310 


eukaryotic translation initiation factor 1 A 


4.45 


3.62 


AA321238 


409513 


Hs. 54642 


methionine adenosyl transferase II, beta 


4.43 




AWQfifi70A 
MVVoDOf 


433282 


Hs.49007 


hypothetical protein 


4.43 




Dljoj iU 1 


421628 


Hs. 106210 


hypothetical protein FLJ10813 


4.37 


3.56 


AL121317 


452170 


Hs.28285 


patched related protein translocated in renal cancer


4 17 




a cr\a.A an <i 


440014 


Hs.6856 


ash2 (absent, small, or homeotic, Drosophila, homolog)-like 


4 17 


1 S9 




431857 


Hs.271742 


ADP-ribosyltransferase (NAD+; poly (ADP-ribose) polymerase)-like 3


4.36 


3.52 


W19144 


417924 


Hs.82932 


cyclin D1 (PRAD1: parathyroid adenomatosis 1)


4 IS 






421733 


Hs.1420


fibroblast growth factor receptor 3 (achondroplasia, thanato- 
phoric dwarfism) 


4.34 


3.50 


AL1 1Q671 


440197 


Hs.317714 


pallid (mouse) homolog, pallidin


4.32 


3 4Q 


MVV 0*tU # UO 


434055 


Hs.3726 


x 003 protein 


4.32 


3 4fl 


r\r 1 QO » 1 £. 


445831 


Hs.13351 


LanC (bacterial lantibiotic synthetase component C)-like 1


4.31 


3.46 


kim nnfiHA 

IN IVI__UUOUO 

5 


439632 


Hs.334437


hypothetical protein MGC4248 




1 4*> 


/\W«tlU# l*» 


448813 


Hs.22142 


cytochrome b5 reductase b5R.2 


4.28 


3.44 


AF1 69802 


449268 


Hs.23412 


hypothetical protein FLJ20160


4.28 


3.43 


AW369278 


429311 


Hs.198998 


conserved helix-loop-helix ubiquitous kinase 


4.28 


3.42 


AF080157 


423599 


Hs.31731 


peroxiredoxin 5 


4.27 


3.41 


AI805664 


422913 


Hs.121599 


CGI-18 protein


4.26 


3.40 


NMJ)1594 
7 


418127 


Hs.83532 


membrane cofactor protein (CD46, trophoblast-lymphocyte
cross-reactive antigen)


4.26 


3.39 


BE243982 
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425221 


Hs.155188


TATA box binding protein (TBP)-associated factor, RNA 
polymerase II, F, 55kD 


4.25 


3.38 


AV649864 


426682 


Hs.2056 


UDP glycosyltransferase 1 family, polypeptide A9 


4.23 


3.37 


AV660038 


421101 


Hs.101840 


major histocompatibility complex, class I-like sequence


4.23 


3.37 


AF010446 


444037 


Hs.380932 


CHMP1.5 protein 


4.22 


3.35 


AV647686 


443407 


Hs.348514 


ESTs, Moderately similar to 2109260A B cell growth factor 
[H.sapiens] 


4.21 


3.35 


AA037683 


448625 


Hs.1 78470 


hypothetical protein FLJ22662


4.21 


3.34 


AW970786 


450997 


Hs.35254 


hypothetical protein FLB6421 


4.16 


3.34 


AW580830 


444336 


Hs. 10882 


HMG-box containing protein 1 


4.15 


3.33 


AFO 19214 


416977 


Hs. 4061 03 


hypothetical protein FKSG44 


4.14 


3.32 


AW 130242 


420613 


Hs.406637 


ESTs, Weakly similar to A47582 B-cell growth factor precursor
[H.sapiens]


4.13 


3.31 


AI873871 


414843 


Hs.77492 


heterogeneous nuclear ribonucleoprotein A0


4.1 


3.30 


BE386038 


408288 


Hs.16886 


gb:z!73d06;r1 Stratagene colon (937204) Homo sapiens 
cDNA clone 5', mRNA sequence 


4.09 


3.29 


AA053601 


422043 


Hs.1 10953 


retinoic acid induced 1 


4.09 


3.29 


AL1 33649 


432864 


Hs.359682 


calpastatin 


4.08 


3.28 


D16217 


410047 


Hs.379753 


zinc finger protein 36 (KOX 18) 


4.06 


3.28 


AI167810


400773 




NM_003105*:Homo sapiens sortilin-related receptor, L(DLR 
class) A repeats-containing (SORL1), mRNA. 


4.06 


3.27 




423960 


Hs.1 36309 


SH3-containing protein SH3GLB1


4.05 


3.27 




449626 


Hs.1 12860 


zinc finger protein 258 


4.04 


3.27 


AA774247 


429953 


Hs.226581 


COX15 (yeast) homolog, cytochrome c oxidase assembly 
protein 


4.04 


3.24 


mm nfLd37 


428901 


Hs.146668 


KIAA1253 protein 


4.02 


3.24 


Al 929568 


420079 


Hs.94896 


PTD011 protein 


3.99 


3.22 


NM 01405 

1 


436576 


Hs.77542 


ESTs 


3.98 


3.21 


AI458213 


412841 


Hs.101395 


hypothetical protein MGC11352 


3.97 


3.21 


AI751157 


431604 


Hs.264190 


vacuolar protein sorting 35 (yeast homolog) 


3.96 


3.21 


AF1 75265 


428318 


Hs.356190 


ubiquitin B 


3.96 


3.19 


BE300110


430677 


Hs.359784 


desmoglein 2 


3.95 


3.19 


Z26317 


407955 


Hs.9343 


ESTs 


3.94 


3.18 


BE53673Q 


426177 


Hs.1 67700 


Homo sapiens cDNA FLJ10174 fis, clone HEMBA1003959


3.92 


3.17 


AA373452 


429802 


Hs.5367 


ESTs, Weakly similar to I38022 hypothetical protein 
[H.sapiens] 


3.92 


3.17 


H09548 


423810 


Hs. 132955 


BCL2/adenovirus E1B 19kD-interacting protein 3-like 


3.92 


3.16 


Al 13?fifi5 


421475 


Hs.1 04640 


HIV-1 inducer of short transcripts binding protein; lymphoma 
related factor 


3.91 


3 15 


MrUUUODl 


436472 


Hs.46366 


KIAA0948 protein 


3.91 


3.14 


AL045404 


434263 


Hs.79187 


ESTs 


3.9 


3.13 


N34895 


400843 




NM_003105*:Homo sapiens sortilin-related receptor, L(DLR 
class) A repeats-containing (SORL1 ), mRNA. 


3.9 


3.13 




440357 


Hs.20950 


phospholysine phosphohistidine inorganic pyrophosphate 
phosphatase 


3.89 


3.12 


AA379353 


437223 


Hs.330716 


Homo sapiens cDNA FLJ14368 fis, clone HEMBA1001122


3.88 


3.12 


C15105 
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426125 


Hs.166994 


FAT tumor suppressor (Drosophila) homolog 


3.86 


3.11 


X87241 


432554 


Hs.278411 


NCK-associated protein 1 


3.86 


3.10 


AI479813 


422506 


Hs.300741 


sorcin 


3.85 


3.10 


Rorjonq 


413786 


Hs. 13500 


ESTs 


3.83 


3.09 


AWR1 ^7 AO 

AVVO 1 Of OU 


429561 


Hs.250646 


baculoviral IAP repeat-containing 6 


3.83 


3.08 


AF2fWW> 


404977 


_ 


Insulin-like growth factor 2 (somatomedin A) (IGF2) 


3.83 


3.08 




427722 


Hs.1 80479 


hypothetical protein FLJ20116 


3.82 


3.08 


AK00019'* 


400844 




NM_003105*:Homo sapiens sortilin-related receptor, L(DLR
class) A repeats-containing (SORL1), mRNA.


3.82 


3.08 




426469 


Hs.363039 


methylmalonate-semialdehyde dehydrogenase


3.81 


3.07 


RP9Q7RRR 


439578 


Hs.350547 


nuclear receptor co-repressor/HDAC3 complex subunit 


3.81 


3.06 


A\A/9fV*19A 
MW£DOl£4 


426508 


Hs.170171 


glutamate-ammonia ligase (glutamine synthase) 


3.8 


3.06 


W 9*^1 Ail 


448524 


Hs.21356 


hypothetical protein DKFZp762K2015 


3.79 


3.06 


AB032948 


448357 


Hs.108923 


RAB38, member RAS oncogene family 


3.79 


3.06 


N20169 


425097 


Hs.1 54545 


PDZ domain containing guanine nucleotide exchange factor(GEF)1


3.77 


^ n^ 


Kill A n<4il0.4 

NMJU1424 
f 


421649 


Hs.106415 


peroxisome proliferative activated receptor, delta 


5.76 


J.JU 


AA791017 


427747 


Hs.180655 


serine/threonine kinase 12 


5.41 




1449 


439010 


Hs.75216 


Homo sapiens cDNA FLJ13713 fis, clone PLACE2000398,
moderately similar to LAR PROTEIN PRECURSOR (LEUKOCYTE
ANTIGEN RELATED) (EC 3.1.3.48)


4.57 


4 80 


r\V V I r UOo£ 


438818 


Hs.30738 


ESTs 


4.49 


4.59 


MV V 9 / 9UUO 


438013 


Hs.15670 


ESTs 


4.42 


4 'SO 


Ainnoi ar 


452929 


Hs.1 7281 6 


neuregulin 1 


4.37 


4.40 


MV V 03<»9OO 


404826 


_ 


Target Exon 


4.22 


4 X> 




429124 


Hs.196914 


minor histocompatibility antigen HA-1 


4.2 


4 9fi 


MWOUOUOD 


421505 


Hs.285641 


KIAA1111 protein 


4.16 


4 24 


AW94QQQ4 


428712 


Hs. 190452 


KIAA0365 gene product 


4.14 


4 1Q 


nWUOOlOl 


427239 


Hs.356512 


ubiquitin carrier protein 


4.11 


4 m 




421595 


Hs.301685 


KIAA0620 protein 


4.1 


4.07 


AB014520 


433844 


Hs.179647 


Homo sapiens cDNA FLJ12195 fis, clone MAMMA1000865


4 04 




AAOiO I/O 


443679 


Hs.9670 


hypothetical protein FLJ10948


4.01 


4 nn 


MIVUU IDIU 


422959 


Hs.349256 


paired immunoglobulin-like receptor beta 


4.01 


3.98 


AV647015 


452012 


Hs.279766 


kinesin family member 4A 


o.ao 


J.9t> 


AA3U77U3 


435320 


Hs.1 17864 


ESTs 


Q7 


o.yi 


A AC77fiO>l 

AAo77y34 


456332 


Hs.399939 


gb:nc39d05.M NCLCGAP__Pr2 Homo sapiens cDNA clone. 
mRNA sequence 




1 QO 

o.oo 


AA228357 


427999 


Hs.1 81 369 


ubiquitin fusion degradation 1-like 


3.94 


3.86 


AI435128 


427681 


Hs.284232 


tumor necrosis factor recentor sun&rtamiix/ momhar 10 
(translocating chain-association membrane protein) 




Q 04 
J.OI 


AB018263 


413929 


Hs.75617 


collagen, type IV, alpha 2 


3.93 


3.79 


BE501689 


420116 


Hs.95231 


FH1/FH2 domain-containing protein 


3.9 


3.77 


NMJH324 
1 


433914 


Hs.1 12160 


Homo sapiens DNA helicase homolog (PIF1) mRNA, partial 
cds 


3.88 


3.75 


AF108138 


420732 


Hs.367762 


ESTs 


3.87 


3.74 


AA789133 


452517 




gb:RC-BT068-1 30399-068 BT068 Homo sapiens cDNA, 


3.84 


3.70 


AI904891 
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mRNA sequence 








437524 


Hs.385719 


ESTs 


3.82 


3.68 


MlQZr OOO 


435158 


Hs.65588 


DAZ associated protein 1 


I 3.8 


3.66 


AVVOuOJ 1 1 


448780 


Hs.267749 


Human DNA sequence from clone 366N23 on chromosome
6q27. Contains two genes similar to consecutive parts of the
C. elegans UNC-93 (protein 1, C46F11.1) gene, a KIAA0173
and Tubulin-Tyrosine Ligase LIKE gene, a Mitotic Feedback
Control Protein MADP2 H


3.8 


3.65 


W92071 


445084 


Hs.250848 


hypothetical protein FLJ14761 


3.79 


3.64 


H38914 


423138 




gb:EST385571 MAGE resequences, MAGM Homo sapiens 
cDNA, mRNA sequence 


3.75 


3.60 


AW973426 


419602 


Hs.91521 


hypothetical protein 


3.74 


3.59 


AW248434 


442549 


Hs.8375 


TNF receptor-associated factor 4 


3.74 


3.58 


AI751601 


450893 


Hs.25625 


hypothetical protein FLJ11323


3.73 


3.55 


AK002185 


414223 


Hs.238246 


hypothetical protein FLJ22479 


3.73 


3.55 


AA954566 


444312 


Hs.351142 


ESTs 


3.72 


3.53 


R44007 


425205 


Hs.155106 


receptor (calcitonin) activity modifying protein 2 


3.71 


3.51 


NM 00585 
4 


432327 


Hs.274363 


neuroglobin 


3.71 


3.49 


R36571 


451970 


Hs.211046 


ESTs 


3.67 


3.48 




408049 


Hs.345588 


desmoplakin (DPI, DPII) 


3.67 


3.45 




440100 


Hs. 158549 


ESTs, Weakly similar to T2D3_HUMAN TRANSCRIPTION 
INITIATION FACTOR TFIID 135 KDA SUBUNIT [H.sapiens] 


3.66 


3.45 


BE382685 


426468 


Hs. 11 7558 


ESTs 


3.65 


3.43 


nnO f C70UD 


402384 




NM_007181*:Homo sapiens mitogen-activated protein
kinase kinase kinase kinase 1 (MAP4K1), mRNA.


3.64 






458132 


Hs. 103267 


hypothetical protein FLJ22546 similar to gene trap PAT 12


3.64 


AO 


A\A/9A7f»1 9 
M¥V**»fUl4 


447400 


Hs.18457 


hypothetical protein FLJ20315


3.64 






443893 


Hs. 11 5472 


ESTs, Weakly similar to 2004399A chromosomal protein 
[H.sapiens] 


3.63 


O.** 1 


Dcu/youz 


424959 


Hs. 153937 


activated p21cdc42Hs kinase 


3.62 


3.40 


NM_00578 


409586 


Hs.55044 


DKFZP586H2123 protein


3.6 


3 39 




445692 


Hs. 182099 


ESTs 


3.6 


3.37 


AI248322 


433052 


Hs.293003 


ESTs, Weakly similar to PC4259 ferritin associated protein 
[H.sapiens] 


3.6 


o.ou 


AWQ7 1 Qfre 


421782 


Hs. 108258 


actin binding protein; macrophin (microfilament and actin 
filament cross-linker protein) 


3.59 


3.35 


AB029290 


414907 


Hs.77597 


polo (Drosophila)-like kinase


J.JO 




Y0fl70*» I 
AaU ( ZD 


454639 




gb:RC2-ST0158-091099-011-d05 ST0158 Homo sapiens 
cDNA, mRNA sequence 


3.57 


3.33 


AW811633 


434547 


Hs.106124 


ESTs 


3.56 


3.32 


R26240 


439130 


Hs.375195 


ESTs 


3.55 


3.32 


AA306090 


413564 




gb:601146990F1 NIH_MGC_19 Homo sapiens cDNA clone
5', mRNA sequence


3.54 


3.31 


BE260120 


443471 


Hs.398102 


Homo sapiens clone FLB3442 PRO0872 mRNA, complete 
cds 


3.53 


3.31 


AW236939 
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i 424415 


Hs. 146580 


enolase 2, (gamma, neuronal) 


3.52 


3.30 


NM_00197 
5 


405036 




NM_021628*:Homo sapiens arachidonate lipoxygenase 3
(ALOXE3), mRNA. VERSION NM_020229.1 GI


3.52 


3.29 




422068 


Hs.104520 


Homo sapiens cDNA FLJ13694 fis, clone PLACE2000115


3 52 


3 2Q 

O.Z27 


AIOA7C4 ft 

Al 00 70 1 9 


424244 


Hs.143601 


hypothetical protein hCLA-iso 


3 52 


3 2ft 


AVq471o4 


451867 


Hs.27192 


hypothetical protein dJ1057B20.2 


3 51 


3 2ft 


Wf4i57 


429187 


Hs.1 63872 


ESTs, Weakly similar to S65657 alpha-1C-adrenergic receptor
splice form 2 [H.sapiens]


3.49 


3.26 


AA447648 


415200 


Hs.78202 


SWI/SNF related, matrix associated, actin dependent regulator
of chromatin, subfamily a, member 4


•a AO 


3 25 
O.Z3 


ALU4Q3ZS 


405667 




Target Exon 


3 4A 


3 25 




421075 


Hs.1 01474 


KIAA0807 protein 


3 47 


3 23 


AdU1o3dO 


424909 


Hs. 153752 


cell division cycle 25B 


3.46 


3 22 


OfOlOf 


451164 


Hs.60659 


ESTs, Weakly similar to T46471 hypothetical protein
DKFZp434L0130.1 [H.sapiens]


3.46 


3.21 


AA015912 


438644 


Hs.1 29037 


ESTs 




3 2n 


AM Oft i AO 
Mil ZO IOZ 


432258 


Hs.293039 


ESTs 


3 A*? 


3 1Q 


AWyr 3U# 0 


411817 


Hs.72241 


mitogen-activated protein kinase kinase 2




3 1Q 


Dtouzyuu 


414918 


Hs.72222 


hypothetical protein FLJ13459


3.45 


3.18 


AI219207 


437256 


Hs.97671 


Homo sapiens clone IMAGE:3845253, mRNA, partial cds




3.1/ 


AL 137404 


404208 




C6001282'Oil4504223lreflNP 000179 11 nli in irnnirlacn Koto 
[Homo sapiens] gi|114963|sp|P082 


3 42 


3 1ft 




421989 


Hs.1 10457 


Wolf-Hirschhorn syndrome candidate 1


3.4 


3.15 


AJ007042 


438942 


Hs.6451 


PRO0659 protein




3.14 


AW875398 


412649 


Hs.74369 


integrin, alpha 7 


3.38 


3.14 


NM_0v22U 
6 


414840 


Hs.23823 


hairy/enhancer-of-split related with YRPW motif-like


3.37 


3.13 


R27319 


434831 


Hs.273397 


KIAA0710 gene product


o oc 

3.35 


3.12 


AA248060 


431842 


Hs.271473 


epithelial protein up-regulated in carcinoma, membrane
associated protein 17




3.11 


■kit J 

IMM__0057o 
4 


402328 




Target Exon 




3.1U 




405371 




NM_005569*:Homo sapiens LIM domain kinase 2 (LIMK2),
transcript variant 2a, mRNA.


3.33 


3 10 
o. 1 u 




} 441650 


Hs.1 32545 


ESTs 




3 OQ 


AlOftiQftO 
Ml ZD 1 gOU 


418629 


Hs.86859 


growth factor receptor-bound protein 7 


3 3 


3 no 




406002 




Target Exon 


3.3 


3.08 




420307 


Hs.66219 


ESTs 


1 00 


*a nn 


AUJCAOOCO 

AVVoUzooy 


425093 


Hs.154525 


KIAA1076 protein 


3 2ft 


3 f!7 
J.Ur 




427351 


Hs.1 23253 


hypothetical protein FLJ22009 


1 Oft 


3.1// 


A\AM n*)CQO 

AW4QZoy3 


417900 


Hs.82906 


CDC20 (cell division cycle 20, S. cerevisiae, homolog)


3.28 


3.06 


BE250127 


457228 


Hs.1 95471 


Human cosmid CRI-JC2015 at D10S289 in 10sp13 


3.27 


3.05 


U15177 


421026 


Hs.1 01 067 


GCN5 (general control of amino-acid synthesis, yeast, homolog)-like 2


3.27 


3.04 


AL047332 


430746 


Hs.406256 


ESTs 


3.27 


3.03 


AW977370 


409556 


Hs.54941 


phosphorylase kinase, alpha 2 (liver) 


3.27 


3.03 


D38616 


451225 


Hs.57655 


ESTs 


3.26 


3.03 


AI433694 
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404913 


- 


NM_024408*:Homo sapiens Notch (Drosophila) homolog 2
(NOTCH2), mRNA. VERSION NM_024410.1 GI


3.25 


3.02 


- 


404875 


- 


NM_022819*:Homo sapiens phospholipase A2, group IIF
(PLA2G2F), mRNA. VERSION NM_020245.2 GI


3.23 


3.02 




404606 




Target Exon 


3.23 


3.01 




414732 


Hs.77152 


minichromosome maintenance deficient (S. cerevisiae) 7 


3.22 


3.01 


AW410976 


425380 


Hs.32148 


AD-015 protein 


3.22 


3.00 


AA356389 


421186 


Hs.270563 


ESTs, Moderately similar to T12512 hypothetical protein 
DKFZp434G232.1 [H.sapiens] 


3.21 


2.98 


AI798039 


445462 


Hs.288649 


hypothetical protein MGC3077 


3.2 


2.97 


AA378776 



Permutation analysis of 100 most significantly up-regulated genes in each group

By permuting the sample labels 500 times we estimated the significance of the
differentially expressed genes. The permutation analysis revealed that it was highly
unlikely to find as good markers by chance, as similar good markers were only found
in 5% of the permuted data sets, see Table 2.
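
A label-permutation test of this kind can be sketched as below. The sketch computes a per-gene empirical P value from 500 label shuffles; how the "5% perm" column of Table 2 was actually derived is not specified in the text, so the procedure, the Welch-type statistic, the seed and all names here are assumptions for illustration.

```python
import math
import random
from statistics import mean, variance

def t_stat(x, y):
    """Unequal-variance t statistic (assumed; the source only says 't-test')."""
    return (mean(x) - mean(y)) / math.sqrt(variance(x) / len(x) + variance(y) / len(y))

def permutation_p_value(values, labels, group, n_perm=500, seed=0):
    """Fraction of label permutations scoring at least as high as the real labels."""
    rng = random.Random(seed)

    def score(labs):
        inside = [v for v, lab in zip(values, labs) if lab == group]
        outside = [v for v, lab in zip(values, labs) if lab != group]
        return t_stat(inside, outside)

    observed = score(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # permute the sample labels in place
        if score(shuffled) >= observed:
            hits += 1
    # Add-one estimator so the P value is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Toy gene that is clearly higher in the non-progressing samples.
values = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = ["no_prog"] * 6 + ["prog"] * 6
p_value = permutation_p_value(values, labels, "no_prog")
print(p_value < 0.05)
```

The group sizes stay fixed under shuffling, so each permuted data set has the same design as the original one.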

Molecular predictor of progression 

A molecular predictor of progression using a combination of genes may have higher prediction
accuracy than when using single marker genes. Therefore, to identify the gene-set that
gives the best prediction results using the lowest number of genes we built a predictor using
the "leave one out" cross-validation approach, as previously described (Golub et al. 1999).
Selecting the 100 best genes in each cross-validation loop gave the lowest number of prediction
errors (5 errors, 83% correct classification) in our training set consisting of the 29
tumors (see Figure 3). As in our previous study we used a maximum likelihood classification
approach. We selected a gene-expression signature consisting of those 45 genes that were
present in at least 75% of the cross-validation loops, and these represent our optimal gene-set for
progression prediction (Fig. 4a and Table 3).
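
The leave-one-out procedure with gene selection inside each loop can be sketched as follows. The sketch is illustrative only: a nearest-class-mean rule stands in for the maximum likelihood classifier mentioned above, the t statistic is a Welch-type statistic, and the function names, labels and toy genes are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, variance

def t_stat(x, y):
    return (mean(x) - mean(y)) / math.sqrt(variance(x) / len(x) + variance(y) / len(y))

def select_genes(expr, labels, samples, n_per_group):
    """Top-n genes up-regulated in each group, using the training samples only."""
    scores = {}
    for gene, values in expr.items():
        a = [values[i] for i in samples if labels[i] == "no_prog"]
        b = [values[i] for i in samples if labels[i] == "prog"]
        scores[gene] = t_stat(a, b)
    ranked = sorted(scores, key=scores.get)      # most "prog"-like first
    return ranked[:n_per_group] + ranked[-n_per_group:]

def classify(expr, labels, train, test, genes):
    """Assign the held-out sample to the nearest class mean (stand-in classifier)."""
    dist = {}
    for group in ("no_prog", "prog"):
        members = [i for i in train if labels[i] == group]
        dist[group] = sum(
            (expr[g][test] - mean(expr[g][i] for i in members)) ** 2 for g in genes
        )
    return min(dist, key=dist.get)

def leave_one_out(expr, labels, n_per_group, min_fraction=0.75):
    """Return the number of errors and the genes picked in >= min_fraction of loops."""
    n = len(labels)
    errors, counts = 0, Counter()
    for held_out in range(n):
        train = [i for i in range(n) if i != held_out]
        genes = select_genes(expr, labels, train, n_per_group)
        counts.update(genes)
        if classify(expr, labels, train, held_out, genes) != labels[held_out]:
            errors += 1
    signature = [g for g, c in counts.items() if c >= min_fraction * n]
    return errors, signature

labels = ["no_prog"] * 4 + ["prog"] * 4
expr = {
    "up_np": [9.0, 10.0, 11.0, 12.0, 1.0, 2.0, 3.0, 4.0],
    "up_p": [1.0, 2.0, 3.0, 4.0, 9.0, 10.0, 11.0, 12.0],
    "noise1": [5.0, 6.0, 7.0, 5.5, 6.5, 5.2, 6.8, 6.1],
    "noise2": [3.0, 4.5, 3.5, 4.0, 4.2, 3.1, 3.9, 4.4],
}
errors, signature = leave_one_out(expr, labels, n_per_group=1)
print(errors, sorted(signature))
```

Selecting genes inside each loop, rather than once on all samples, keeps the held-out tumor from influencing the gene list used to classify it.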


Many of these 45 genes were also found among the 200 best markers of progression; however,
the cross-validation approach also identified other interesting markers of progression
like BIRC5 (Survivin), an apoptosis inhibitor that is up-regulated in the tumors that show later
progression. BIRC5 has been reported to be expressed in most common cancers (Ambrosini
et al. 1997). To validate the significance of the 45-gene expression signature we used a test
set consisting of 19 early stage bladder tumors (9 tumors with no progression and 10 tumors
with later progression). Total RNA from these samples was amplified, labeled and hybridized
to customized 60mer-oligonucleotide microarray glass slides and the relative expressions
of the 45 classifier genes were measured following appropriate normalization and
background adjustments of the microarray data. The independent tumor samples were classified
as non-progressing or progressing according to the degree of correlation to the average
no progression profile from the training samples (Fig. 3b). When applying no cutoff limits
to the predictions the predictor identified 74% of the samples correctly. However, as done
recently in a breast cancer study (van't Veer et al. 2002), we applied correlation cutoff limits
of 0.1 and -0.1 in order to disregard samples with really low correlation values and in this
way we obtained 92% correct predictions of samples with correlation values above 0.1 or
below -0.1. Although the test-set is limited in size the performance is notable and could be of
clinical use.
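
The correlation-based classification with cutoff limits can be sketched as below. Pearson correlation is assumed, as the text does not name the correlation measure; the function names, the class strings and the toy four-gene profile are hypothetical.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equally long value lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def classify_by_correlation(sample, no_prog_profile, cutoff=0.1):
    """Classify a sample by its correlation to the average no-progression profile.

    Correlations above +cutoff are called non-progressing, below -cutoff
    progressing, and anything in between is left unclassified.
    Setting cutoff=0.0 corresponds to applying no cutoff limits.
    """
    r = pearson(sample, no_prog_profile)
    if r > cutoff:
        return "no progression", r
    if r < -cutoff:
        return "progression", r
    return "unclassified", r

# Toy average no-progression profile over four classifier genes.
profile = [1.0, -1.0, 0.5, -0.5]
print(classify_by_correlation([2.0, -2.0, 1.0, -1.0], profile)[0])
print(classify_by_correlation([-2.0, 2.0, -1.0, 1.0], profile)[0])
print(classify_by_correlation([1.0, 1.0, -1.0, -1.0], profile)[0])
```

Samples in the unclassified band are simply excluded from the accuracy figure, which is how a higher percentage of correct calls can be reported on a smaller number of samples.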



Table 3. The 45 optimal genes for disease progression prediction.





■< ;;~ :l 




. _ ^iL^wrrr-.^iJS 




lExernirfagfai 




439010 


Hs.75216 


protein tyrosine phosphatase, receptor 
type, F 


4.57 


4.39 


PTPRF 


AW170332


29 


429124 


Hs.196914 


minor histocompatibility antigen HA-1 


4.20 


4.09 




AW505086 


29 


421649 


Hs.106415 


peroxisome proliferative activated recep- 
tor, delta 


5.76 


5.64 


PPARD 


AA721217 


29 


433914 


Hs.1 12160 


DNA helicase homolog (PIF1)


3.88 


3.61 


PIF1 


AF108138 


29 


429187


Hs. 163872 


ESTs, Weakly similar to hypothetical
protein FLJ20489


3.49 


3.17 


- 


AA447648 


28 


422765 


Hs.1578 


baculoviral IAP repeat-containing 5
(survivin)


2.68 


2.56 


BIRC5 


AW409701 


28 


433844 


Hs. 179647 


ESTs 


4.04 


3.80 




AA610175 


26 


450893 


Hs.25625 


Hypothetical protein FLJ11323


3.73 


3.46 


FLJ11323 


AK002185 


25 




nS.ZooUlo 


ESTS 


3.10 


3.02 




R26969 


24 


424909 


Hs. 153752 


cell division cycle 25B 


3.46 


3.16 


CDC25B 


S78187 


24 


452929 


Hs.172816 


neuregulin 1 


4.37 


4.23 


NRG1 


AW954938 


23 


420116 


Hs.95231 


formin homology 2 domain containing 1 


3.90 


3.63 


FHOD1 


NM_013241 


22 


453963 


Hs.28959 


cDNA FLJ36513 fis, clone
TRACH2001523


3.44 


2.88 




AA040311 


29 


429561 


Hs.250646 


baculoviral IAP repeat-containing 6 
(apollon) 


3.83 


3.03 


BIRC6 


AF265555 


29 


418127 


Hs.83532 


membrane cofactor protein (CD46, 
trophoblast-lymphocyte cross-reactive 
antigen) 


4.26 


3.37 


MCP 


BE243982 


29 


422119 


Hs.1 11862 


KIAA0590 gene product 


2.33 


1.95 


KIAA0590 


AI277829 


29 


435521 


Hs.6361 


mitogen-activated protein kinase kinase 
1 interacting protein 1 


5.24 


4.53 


MAP2K1IP1 


W23814 


29 


409632 


Hs.55279 


serine (or cysteine) proteinase inhibitor, 
clade B (ovalbumin), member 5 


4.89 


4.11 


SERPINB5 


W74001 


29 


452829 


Hs.63368 


ESTs 


4.95 


4.31 




AI955579 


29 


416640 


Hs.79404 


DNA segment on chromosome 4 
(unique) 234 expressed sequence 


6.03 


5.51 


D4S234E 


BE262476 


29 


425097 


Hs.154545 


PDZ domain containing guanine nucleo- 
tide exchange factor(GEF)1 


3.77 


3.18 


PDZ-GEF1 


NM_014247 


28 
445926


Hs.334826 


splicing factor 3b, subunit 1, 155kDa 


2.40 


2.03 


SF3B1 


AF054284 


28 


437325 


Hs.5548 


F-box and leucine-rich repeat protein 5 


2.48 


2.09 


FBXL5 


AF142481


28 


448813 


Hs.22142 


cytochrome b5 reductase b5R.2 


4.28 


3.41 


LOC51700 


AF169802


28 



426799 


Hs.303154 


ESTs 


4.86 


4.04 




H14843


28 


446847 


Hs.82845 


ESTs 


4.65 


3.79 




T51454 


28 


428016 


Hs.181461


ariadne homolog, ubiquitin-conjugating 
enzyme E2 binding protein, 1 (Droso- 
phila) 


3.77 


3.15 


ARIH1 


AJ243190 


27 


418321 


Hs.84087 


KIAA0143 protein 


4.62 


3.76 


KIAA0143 


D63477 


27 


422984 


Hs.351597 


ESTs 


3.50 


2.93 






ZD 


408688 


Hs.152925


KIAA1268 protein 


3.52


2.95 


KIAA1268 


AI634522 


26 


440357 


Hs.20950 


phospholysine phosphohistidine inorganic
pyrophosphate phosphatase


3.89 


3.07 


LHPP 


AA379353 


26 


420269 


Hs.96264 


alpha thalassemia/mental retardation 
syndrome X-linked (RAD54 (S. cere- 
visiae) homolog) 


3.39 


2.85 


ATRX 


U72937 


26 


423185 


? 


ornithine decarboxylase antizyme 1 


4.61 


3.71 


OAZ1 






443407 


Hs.348514 


clone IMAGE:4052238, mRNA, partial 
cds 


4.21 


3.32 




AA037683 



25 


457329 


Hs.359682 


calpastatin 


3.59 


2.99 


CAST 


AI634860


25 


452714 


Hs.30340 


KIAA1165: likely ortholog of mouse
Nedd4 WW domain-binding protein 5A


3.62 


3.01 


KIAA1165


AW770994 


25 


444773 


Hs.11923 


hypothetical protein DJ167A19.1 


3.71 


3.11 


DJ167A19.1


BE156256



24 


418504 


Hs.85335 


ESTs 


4.59 


3.67 




BE159718



24 


444604 


Hs.11441 


Chromosome 1 open reading frame 8 


4.89 


4.17 


C1orf8 


AW327695 


23 


410691 


Hs.65450 


reticulon 4 










o*i 
£o 


430604 


Hs.247309 


succinate-CoA ligase, GDP-forming, 
beta subunit 


4.61 


3.72 


SUCLG2 


AV650537 


23 


421311 


Hs.283609


muscleblind-like protein MBLL39


4.65 


3.82 


MBLL39 


N71848 


23 


439632 


Hs.334437 


hypothetical protein MGC4248 


4.29 


3.42 


MGC4248 


AW410714 


22 


417924 


Hs.82932 


cyclin D1 (PRAD1: parathyroid adeno- 
matosis 1) 


4.35 


3.49 


CCND1 


AU077231 


22 


453395 


Hs.377915 


mannosidase, alpha, class 2A, member 
1 


4.71 


3.84 


MAN2A1 


D63998


22 



Permutation analysis of 45 genes 

Again, permutation analysis revealed that, for all of the 45 genes, similarly good markers
were found in only 5% of the 500 permuted datasets (see Table 3).

Expression profiling of metachronous higher stage tumors

Expression profiling of the metachronous higher stage tumors could provide important
information on the degree of expression similarity between the primary and the secondary
tumors. Tissues from secondary tumors were available from 14 of the patients with disease
progression and these were also hybridized to the customized Affymetrix GeneChips.
Hierarchical cluster analysis of all tumor samples based on the 3,213 most varying probesets
showed that tumors originating from the same patient clustered tightly together in 9 of the
cases, indicating a high degree of intra-individual similarity in expression profiles (Fig. 5).
Notably, one tightly clustering pair of tumors was a Ta and a T2+ tumor (patient 941). It was
remarkable that Ta and T1 tumors and T1 or T2+ tumors from a single individual were more
similar than e.g. Ta tumors from two individuals. There was no correlation between the presence
or absence of tight clustering of samples from the same patient and the time interval to
tumor progression. The tight clustering of the 9 tumor pairs probably reflects the monoclonal
nature of many bladder tumors (Sidransky et al. 1997). A set of genomic abnormalities such as
chromosomal gains and losses characterizes bladder tumors of different stages from single
individuals (Primdahl et al. 2002), and such physical abnormalities could be one of the
causes of the strong similarity of metachronous tumors. The fact that 5 of the tumor pairs
clustered apart may be explained by an oligoclonal origin of these tumors.



Customized GeneChip design, normalization and expression measures 
We used a customized Affymetrix GeneChip (Eos Hu03) designed by Eos Biotech Inc., as
described (Eaves et al. 2002). Approximately 45,000 mRNA/EST clusters and 6,200 predicted
exons are represented by the 59,000 probesets on the Eos Hu03 array. Data were normalized
using protocols and software developed at Eos Biotechnology, Inc. (WO0079465).
An "average intensity" (AI) for each probeset was calculated by taking the trimean of probe
intensities following background subtraction and normalization to a gamma distribution
(Tukey 1977).
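
As an illustration of the trimean step only, the following minimal sketch computes Tukey's trimean, (Q1 + 2 x median + Q3) / 4, for the probe intensities of one probeset. The function name, the example intensities and the quartile convention (Python's "inclusive" method) are assumptions made for illustration; the background subtraction and the gamma-distribution normalization are not reproduced here.

```python
import statistics


def trimean(values):
    """Tukey's trimean: (Q1 + 2 * median + Q3) / 4."""
    # Quartile convention is an assumption; the source does not state one.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4


# Hypothetical background-corrected probe intensities for one probeset.
probe_intensities = [120.0, 135.0, 150.0, 160.0, 900.0]
average_intensity = trimean(probe_intensities)
print(average_intensity)
```

The trimean weights the median twice as heavily as each quartile, so a single outlying probe (900.0 above) has little influence on the summary value.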



cRNA preparation, array hybridization and scanning 

Preparation of cRNA from total RNA and subsequent hybridization and scanning of the cus- 
tomized GeneChip microarrays (Eos Hu03) were performed as described previously 
(Dyrskjot et al. 2003). 



Custom oligonucleotide microarray procedures 

Three 60mer oligonucleotides were designed for each of the 45 genes using Array Designer
2.0. All steps in the customized oligonucleotide microarray analysis were performed essentially
as described (Kruhoffer et al.). Each of the probes was spotted in duplicate and all
hybridisations were carried out twice. The samples were labelled with Cy3 and a common
reference pool was labelled with Cy5. The reference pool was made by pooling cRNA
generated from the investigated samples and from universal human RNA. Following scanning of
the glass slides, the fluorescent intensities were quantified and background adjusted using
SPOT 2.0 (Jain et al. 2002). Data were subsequently normalized using a LOWESS normalisation
procedure implemented in the SMA package for R. To select the best oligonucleotide
probe for each of the 45 genes, 13 of the samples from the training set were re-analysed on
the custom oligonucleotide microarray platform and the obtained expression ratios were
compared to the expression levels from the Affymetrix GeneChips. For each gene, the
oligonucleotide probe with the highest correlation to the Affymetrix GeneChip probe was selected.
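
The probe selection described above can be sketched as follows. This is a minimal illustration assuming a plain Pearson correlation across the re-analysed samples; the source does not state which correlation measure was used, and the probe names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def best_probe(candidates, reference):
    """Return the candidate probe whose values correlate best with the reference."""
    return max(candidates, key=lambda name: pearson(candidates[name], reference))


# Hypothetical data: GeneChip levels for one gene in four samples, and the
# expression ratios of three candidate oligonucleotides in the same samples.
genechip_levels = [1.0, 2.0, 3.0, 4.0]
oligo_ratios = {
    "oligo_1": [1.0, 3.0, 2.0, 4.0],
    "oligo_2": [2.0, 4.0, 6.0, 8.0],
    "oligo_3": [4.0, 3.0, 2.0, 1.0],
}
print(best_probe(oligo_ratios, genechip_levels))
```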



Expression data analysis 

Before analysing the expression data from the Eos Hu03 GeneChips, control probes were
removed and only probes with AI levels above 100 in at least 8 experiments and with a
max/min ratio equal to or above 1.6 were selected. This filtering generated a gene-set consisting
of 6,647 probes for further analysis. Average linkage hierarchical cluster analysis of the
tumour samples was carried out using a modified Pearson correlation as the similarity metric
(Eisen et al. 1998). Genes and arrays were median centered and normalised to a magnitude
of 1 before clustering. We used the GeneCluster 2.0 software for the supervised selection
of markers and for performing permutation tests. The 45 genes for predicting progression
were selected by t-test statistics and cross-validation performance as previously described
(Dyrskjot et al. 2003), and independent samples were classified according to their
correlation to the average no-progression signature profile of the 45 genes.
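
The filtering and centering steps can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, probes with non-positive minimum levels are rejected to avoid division by zero, and "magnitude of 1" is taken to mean a Euclidean norm of 1. None of these details is specified in the text.

```python
import math
import statistics


def passes_filter(levels, min_level=100.0, min_count=8, min_ratio=1.6):
    """Keep a probe if its level exceeds min_level in at least min_count
    experiments and its max/min ratio is at least min_ratio."""
    above = sum(1 for x in levels if x > min_level)
    lowest, highest = min(levels), max(levels)
    if lowest <= 0:
        return False  # assumption: the ratio is undefined, so reject the probe
    return above >= min_count and highest / lowest >= min_ratio


def center_and_scale(values):
    """Median-centre a vector and scale it to unit Euclidean length."""
    median = statistics.median(values)
    centered = [v - median for v in values]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered] if norm > 0 else centered


# Hypothetical AI levels of one probe across ten experiments.
levels = [150.0] * 8 + [300.0, 90.0]
if passes_filter(levels):
    print(center_and_scale(levels))
```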

EXAMPLE 2 

Identifying distinct classes of bladder carcinoma using microarrays 

Patient disease course information - class discovery 

We selected tumours from the entire spectrum of bladder carcinoma for expression profiling 
in order to discover the molecular classes of the disease. The tumours analysed are listed in 
Table 4 below together with the available patient disease course information. 



Table 4 Disease course information of all patients involved- class discovery. 





! 






r 


fa 




§#%■ 1 'i 








I a gr 2(200297) 


Papillary 


Ta gr3 




no 




968-1 




Tagr 2 (011098) 


Papillary 


+ 


Tagr 2 (150101) 


no 




934-1 




Ta gr 2 (220798) 


Papillary 






no 




928-1 




Ta gr 2 (240698) 


Papillary 


+ 




no 




930-1 




Tagr 2 (300698) 


Papillary 


+ 




no 


B 


989-1 




Ta gr 3 (281098) 


Papillary 






no 




1264-1 




Tagr 3 (130600) 


Papillary 




Tagr 2 (231000) 
Tagr 2 (220101) 
Ta gr 2 (300401) 


no 




876-5 


Ta gr 2 (230398) 
Tagr 2 (271098) 
Ta gr 2 (090699) 
Tagr 2 (01 1199) 


Tagr 3 (170400) 


Papillary 


+ 




no 
la gr ^ \iui£yo/
Ta nr 9 (1 <?fiftQ71 
Ta or 1 M 61 2971 
Ta gr 3 (270498) 
Ta gr 2 (220299) 




r'apiiiary 


i a gr2 


Ta gr 2 (120100) 
Ta gr 2 (250500) 

Ta nr O /9CnQnn\ 

i a gr d. (^ouyuu) 

Ta or 2 (0*5020,1 1 
id yi t ^uju^u i ) 


no 




716-2 


Tagr 2 (070397) 


Ta gr 3 (230497) 


Papillary 


+ 


Ta ar 2 (040697) 
Tagrl (170698) 


no 


c 


1070-1 




Ta gr 3 (150399) 


Papillary 




Ta gr 3 (291099) 


Subsequent visit 




956-2 




Ta gr 3 (061299) 


Papillary 


+ 


Ta gr 3 (061200) 


Sampling visit 




1062-2 




Tagr 3 (120799) 


Papillary 


+ 


T1 gr 3 (161199) 


Sampling visit 




1166-1 




Ta gr 3 (271099) 


Papillary 


+ 




Sampling visit 




1330-1 




Tagr 3 (311000) 


Papillary 


+ 




Sampling visit 


D 


112-10 


Ta gr 2 (070794) 
Ta gr 3 (01 1294) 
1 1 gr 3( 19UD90J 
Tagr 3 (121095) 
T1 gr 3(040396) 
Ta or 2 f?OOfiQfi\ 
Tagr 2 (111296) 
Ta gr 2 (230497) 
Ta gr 2 (030997) 


Ta gr 3 (060198) 


Papillary 


+ 


Tagr 3 (110698) 
T1 gr 3 (191098) 
Ta gr 3 (240299) 
T1 gr 3 (050799) 
T1 gr 3 (081199) 
T1 gr 3 (180400) 


Previous visit 




320-7 


T1 gr 3 (011194) 
T1 gr 3 (150896) 

Ta nr MnnAQ7\ 
id yl O \ IUU09i J 


Ta gr 3 (290997) 


Papillary 


+ 


Tagr 3 (290198) 
Ta gr 3 (290698) 


Sampling visit 




747-7 


Tagr 2 (010597) 

Ta nr 2 (23n*5Q71 
Ta or 2 (2309971 
Tagr 2 (260198) 
T1 gr 3 (270498) 
Tagr 2 (170898) 


Tagr 3 (161298) 


Papillary 


+ 


Ta gr 2 (050599) 
Ta gr 2 (280999) 
Ta gr 2 (141299) 


Sampling visit 




967-3 


T1 gr 3 (280998) 
T1 gr 3 (250199) 


Tagr 3 (140699) 


Papillary 


+ 


T1 gr 3 (080999) 


Sampling visit 


E 


625-1 




T1 gr 3 (200996) 


Papillary 






No 




847-1 




T1 ar 3 (2101QA1 


Panllloni 

r^apmary 






No 




1257-1 




T1 ar 3 (2405001 


OUI1U 






Sampling visit 




919-1 




T1 ar 3 (2206981 


i apuiary 






No 




880-1 




T1 ar 3 (3003981 






la gr z (091 198) 
Ta gr 1 (090399) 
i a gr z (050900) 


No 




812-1 




T1 gr 3 (061098) 


Papillary 






No 




1269-1 




T1 gr 3 (230600) 


Papillary 






No 




1083-2 


Tagr 2 (280499) 


T1 gr 3 (120599) 


Papillary 






No 




1238-1 




T1 gr 3 (020500) 


Papillary 


+ 


T2gr3(211100) 
Tagr 2 (211100) 


No 




1065-1 




T1 gr 3 (160399) 


Papillary 






Subsequent visit 




1134-1 




T1 gr 3 (181099) 


Papillary 


T2 gr3 


T1 gr 3 (280200) 


Sampling visit 
T1 or i rn9n*;nn\
T1 or 3 f131100* 




F 


1164-1 




T2+ gr 4 (101299) 


Solid 


gr 3 




Kin 


1032-1 




T2+ gr 7(050199) 


Mixed 






Not measured 


1117-1 




T2+ gr 3 (010999) 


Solid 


+ 




oampung visit 


1178-1 




T2+ gr 3 (200100) 


Solid 


+ 




Not measured 


1078-1 




T2+ gr 3 (120499) 


Solid 


+ 




Not measured 


875-1 




T2+ gr 3 (180398) 


Solid 


+ 




No 


1044-1 




T2+ gr 3 (010299) 


Solid 


+ 


T2+ gr 3 (060999) 


Not measured 


1133-1 




T2+ gr 3 (081099) 


Solid 


+ 




Not measured 


1068-1 




T2+ gr 3 (220399) 


Solid 


+ 




No 


937-1 




T2+ gr 3 (280798) 


Solid 






Not measured 



Group A: Ta gr2 tumours - no recurrence within 2 years. 

Group B: Ta gr3 tumours - no prior T1 tumour and no carcinoma in situ in random biopsies. 
Group C: Ta gr3 tumours - no prior T1 tumour but carcinoma in situ in random biopsies. 
Group D: Ta gr3 tumours - a prior T1 tumour and carcinoma in situ in random biopsies. 
Group E: T1 gr3 tumours - no prior T2+ tumour.
Group F: T2+ tumours gr3/4 - only primary tumours.

* Carcinoma in situ detected in selected site biopsies at previous, sampling or subsequent 
visits. 



Two-way hierarchical cluster analysis of tumor samples 

A two-way hierarchical cluster analysis of the tumour samples based on the 1767 gene-set
(see class discovery using hierarchical clustering) remarkably separated all 40 tumours
according to conventional pathological stages and grades, with only a few exceptions (Fig. 6a).
We identified two main branches containing the superficial Ta tumours, and the invasive T1
and T2+ tumours. In the superficial branch two sub-clusters of tumours could be identified,
one holding 8 tumours that had frequent recurrences and one holding 3 out of the five Ta
grade 2 tumours with no recurrences. In the invasive branch, it was notable that four Ta
grade 3 tumours clustered tightly with the muscle invasive T2+ tumours. These four Ta
tumours, from patients with no previous tumour history, showed concomitant CIS in the
surrounding mucosa, indicating that this sub-fraction of Ta tumours has some of the more
aggressive features found in muscle invasive tumours. The stage T1 cluster could be
separated into three sub-clusters with no clear clinical difference. The one stage T1 grade 3
tumour that clustered with the stage T2+ muscle invasive tumours was the only T1 tumour that
showed a solid growth pattern, all others showing papillary growth. Nine out of ten T2+
tumours were found in one single cluster. The remarkably distinct separation of the tumour
groups according to stage, with practically no overlap between groups, was also
demonstrated by multidimensional scaling analysis (Fig. 6c).
In an attempt to reduce the number of genes needed for class prediction we identified those
genes that were scored by the Cancer Genome Anatomy Project (at NCI) as belonging to
cancer-related groups such as tumour suppressors, oncogenes, cell cycle, etc. These genes
were then selected from the initial 1767 gene-set, and the 88 that showed the largest variation
(SD of the gene vector >=4) were used for hierarchical clustering of the tumour samples.
The obtained clusters were almost identical to the 1767 gene-set cluster dendrogram
(Fig. 6b), indicating that the tumour clustering does not simply reflect larger amounts of
stromal components in the invasive tumour biopsies.

The clustering of the 1767 genes revealed several characteristic profiles in which there was
a distinct difference between the tumour groups (Fig. 6d; black lines identifying clusters a to
j).

Cluster a shows a high expression level in all the Ta grade 3 tumours (Fig. 7a) and, as a
novel finding, contains genes encoding 8 transcription factors as well as other nuclear genes
related to transcriptional activity. Cluster c contains genes that are up-regulated in Ta
grade 3 tumours with high recurrence rate and CIS, in T2+ tumours and in some T1 tumours.
This cluster shows a remarkably tight co-regulation of genes related to cell cycle control
and mitosis (Fig. 7c). Genes encoding cyclins, PCNA as well as a number of centromere related
proteins are present in this cluster. They indicate increased cellular proliferation and may
form new targets for small molecule therapy (Seymour 1999). Cluster f shows a tight cluster
of genes related to keratinisation (Fig. 7f). Two tumours (875-1 and 1178-1) had a very high
expression of these genes and a re-evaluation of the pathology slides revealed that these
were the only two samples to show squamous metaplasia. Thus, activation of this cluster of
genes promotes the squamous metaplasia not infrequently seen by light microscopy in invasive
bladder tumours. The genes in this cluster are listed in Table 5.



Table 5 Genes for classifying samples with squamous metaplasia 



Chip acc. # | UniGene Build 162 | description
D83657_at | Hs.19413 | NM_005621; S100 calcium-binding protein A12
HG3945-HT4215_at | - | -
J00124_at | - | -
L05187_at | - | -
L05188_f_at | Hs.505327 | -
L10343_at | Hs.112341 | NM_002638; skin-derived protease inhibitor 3 preproprotein
L42583_f_at | Hs.367762 | NM_005554; keratin 6A
L42601_f_at | Hs.367762 | NM_005554; keratin 6A
L42611_f_at | Hs.446417 | NM_173086; keratin 6 isoform K6e
M19888_at | Hs.1076 | NM_003125; small proline-rich protein 1B (cornifin)
M20030_f_at | Hs.505352 | -
M21005_at | - | -
M21302_at | Hs.505327 | -
M21539_at | Hs.2421 | NM_006518; small proline-rich protein 2C
M86757_s_at | Hs.112408 | NM_002963; S100 calcium-binding protein A7
S72493_s_at | Hs.432448 | NM_005557; keratin 16
U70981_at | Hs.336046 | NM_000640; interleukin 13 receptor, alpha 2 precursor
V01516_f_at | Hs.367762 | NM_005554; keratin 6A
X53065_f_at | - | -
X57766_at | Hs.143751 | NM_005940; matrix metalloproteinase 11 preproprotein
Z19574_rna1_at | - | -


Cluster g contains genes that are up-regulated in T2+ tumours and in the Ta grade 3 tumours
with CIS that cluster in the invasive branch (Fig. 7g). This cluster contains genes related
to angiogenesis and connective tissue such as laminin, myosin, caldesmon, collagen,
dystrophin, fibronectin, and endoglin. The increased transcription of these genes may indicate
a profound remodelling of the stroma that could reflect signalling from the tumour cells,
from infiltrating lymphocytes, or both. Some of these may also form new drug targets (Fox et
al. 2001). It is remarkable that these genes are those that most clearly separate the Ta grade
3 tumours surrounded by CIS from all other Ta grade 3 tumours. The presence of adjacent
CIS is usually diagnosed by taking a set of eight biopsies from different places in the bladder
mucosa. However, the present data clearly indicate that analysis of stroma remodelling
genes in the Ta tumours could eliminate this invasive procedure.

The clusters b, d, e, h, i, and j contain genes related to nuclear proteins, cell adhesion, 
growth factors, stromal proteins, immune system, and proteases, respectively (see Figure 8). 
A summary of the stage related gene expression is shown in Table 6. 



Table 6 Summary of stage related gene expression

Functional gene clusters (a)

Tumour stage | Transcription | Nuclear processes | Proliferation | Matrix remodelling | Extracellular matrix | Immune system


Tagr2 


T 






u 


i 


Tagr3 


ttt 


tt tt 




U 


4 


T1 gr3 


f 


Tf 




I 


r 


T2gr3 


t 


ttt 


ttt 


t 


t 


Ta gr3 + CIS 


Ttt 


tt ttt 


ttt 


t 


t 



a For a detailed description of gene clusters see Fig. 8. 



b An increase in gene expression was only found in about half of the samples analysed. 
Class prediction of bladder tumours

An objective class prediction of bladder tumours based on a limited gene-set is clinically
useful. We therefore built a classifier using tumours correctly separated in the three main
groups as identified in the cluster dendrogram (Fig. 6a). We used a maximum likelihood
classification method with a "leave one out" cross-validation scheme (Shipp et al. 2002; van't
Veer et al. 2002) in which one test tumour was removed from the set, and a set of predictive
genes was selected from the remaining tumour samples for classifying the test tumour. This
process was repeated for all tumours. Predictive genes that showed the largest possible
separation of the three groups were selected for classification, and each tumour was
classified according to how close it was to the means of the three groups (Fig. 8a).

Classification of samples 

From the hierarchical cluster analysis of the samples (class discovery) we identified three 
major "molecular classes" of bladder carcinoma highly associated with the pathologic staging 
of the samples. Based on this finding we decided to build a molecular classifier that assigns 
tumours to these three "molecular classes". To build the classifier, we only used the tumours 
in which there was a correlation between the "molecular class" and the associated pathologic 
stage. Consequently, a T1 tumour clustering in the "molecular class" of T2 tumours was not 
used to build the classifier. 

The genes used in the classifier were those genes with the highest values of the ratio (B/W)
of the variation between the groups to the variation within the groups. High values of the ratio
(B/W) signify genes with good group separation performance. We calculated the sum over
the genes of the squared distance from the sample value to the group mean and classified
the sample as belonging to the group where the distance to the group mean was smallest. If
the difference between the distances to the closest and the second closest group,
relative to the distance to the closest group, was below 5%, the classification failed and
the sample was classified as belonging to both groups. This relative difference is referred to
as the classifier strength.
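
The two quantities described above can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the text does not give the exact B/W formula, so the sketch assumes a ratio of between-group to within-group sums of squares, and all function names and example values are hypothetical.

```python
def bw_ratio(values, labels):
    """Between-group over within-group sum of squares for one gene
    (assumed definition; the source does not spell out the formula)."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    grand_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for members in groups.values():
        mean = sum(members) / len(members)
        between += len(members) * (mean - grand_mean) ** 2
        within += sum((v - mean) ** 2 for v in members)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def classify(sample, group_means, threshold=0.05):
    """Assign a sample to the group with the smallest summed squared distance.

    Returns (groups, strength); two groups are returned when the classifier
    strength (d_second - d_closest) / d_closest falls below the threshold.
    """
    dist = {
        group: sum((s - m) ** 2 for s, m in zip(sample, means))
        for group, means in group_means.items()
    }
    ranked = sorted(dist, key=dist.get)
    closest, second = ranked[0], ranked[1]
    if dist[closest] > 0:
        strength = (dist[second] - dist[closest]) / dist[closest]
    else:
        strength = float("inf")
    if strength < threshold:
        return (closest, second), strength
    return (closest,), strength


# Hypothetical group means over two classifier genes.
means = {"Ta": [0.0, 0.0], "T1": [2.0, 0.0], "T2": [4.0, 0.0]}
print(classify([0.5, 0.0], means))
print(classify([1.01, 0.0], means))
```

In the second call the sample lies almost midway between two group means, so the strength is below 5% and both groups are reported, mirroring the double assignments such as "T2/T1" that appear in Table 8.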

Classifier performance 

The classifier performance was tested using from 1-160 genes in cross-validation loops. 
Figure 9 shows that the closest correlation to histopathology is obtained in the cross- 
validation model using from 69-97 genes. Based on this we chose the model using 80 genes 
for cross-validation as our final classifier model. 

Classifier model using 71 genes 



We selected those genes for our final classifier model that were used in at least 75% (25
times) of the cross-validation loops. These 71 genes are listed in Table 7.
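
The selection of genes that recur across leave-one-out loops can be sketched as below. This is a minimal illustration under assumptions: genes are ranked in each loop by a between/within sum-of-squares ratio (the exact B/W formula is not given in the text), and the function names, gene names and values are hypothetical.

```python
from collections import Counter


def bw_ratio(values, labels):
    """Between-group over within-group sum of squares (assumed definition)."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    grand_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for members in groups.values():
        mean = sum(members) / len(members)
        between += len(members) * (mean - grand_mean) ** 2
        within += sum((v - mean) ** 2 for v in members)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def stable_genes(expr, labels, top_k, min_fraction=0.75):
    """Genes ranked among the top_k in at least min_fraction of the
    leave-one-out loops. expr maps gene name -> values per sample."""
    n = len(labels)
    counts = Counter()
    for left_out in range(n):
        keep = [i for i in range(n) if i != left_out]
        kept_labels = [labels[i] for i in keep]
        scores = {
            gene: bw_ratio([values[i] for i in keep], kept_labels)
            for gene, values in expr.items()
        }
        top = sorted(scores, key=scores.get, reverse=True)[:top_k]
        counts.update(top)
    return sorted(gene for gene, c in counts.items() if c >= min_fraction * n)


# Hypothetical expression values for three genes in six tumours.
labels = ["Ta", "Ta", "Ta", "T1", "T1", "T1"]
expr = {
    "g_good": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],
    "g_noise": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
    "g_mid": [1.0, 2.0, 3.0, 2.0, 3.0, 4.0],
}
print(stable_genes(expr, labels, top_k=1))
```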

Table 7 Feature: Accession number on HuGeneFL array. Number: Number of times used in
the 80 genes cross validation loops. Test (B/W): see below.



AF000231_at 




- Gcga^atei) ...... , k . 




UQ33;|«,. . • 




ns. fsoio 


nm_uu40w; Kas-reiated protein Rab-1 1 A 


33 


26.77 


D13666_s_at 


Hs. 136348 


NM_006475; osteoblast specific factor 2 (fasciclin I-like)


33 


27.71 


D49372_s_at 


Hs.54460 


NM_002986; small inducible cytokine A11 precursor


31 


25.78 


D83920_at


Hs.440898 


NM_002003; ficolin 1 precursor


33 


31.18 


D86479_at 


Hs.439463 


NM_001129; adipocyte enhancer binding protein 1 precursor


33 


28.29 


D89077_at 


Hs.75367 


NM_006748; Src-like-adaptor 


33 


30.03 


D89377_at


Hs.89404 


NM_002449; msh homeo box homolog 2 


33 


51.50 


HG4069-HT4339_s_at 






27 


25.06 


HG67-HT67_f_at 






33 


27.81 


HG907-HT907_at 






33 


25.76 


J02871_s_at 


Hs.436317 


NM_000779; cytochrome P450, family 4, subfamily B,
polypeptide 1


33 


32.61 


J03278_at 


Hs.307783 


NM_002609; platelet-derived growth factor receptor beta 
precursor 


33 


28.02 


J04058_at 


Hs.169919


NM_000126; electron transfer flavoprotein, alpha polypep- 
tide 


33 


29.46 


J05032_at 


Hs.32393 


NM_001349; aspartyl-tRNA synthetase 


33 


38.21 


J05070_at 


Hs.151738


NM_004994; matrix metalloproteinase 9 preproprotein


33 


35.34 


J05448_at 


Hs.79402 


NM_002694; DNA directed RNA polymerase II polypeptide 
C NM_032940; DNA directed RNA polymerase II polypep- 
tide C 


32 


26.51 


L/A4 OOC Hi 

l\U1 «jyo_at 


Hs.297681 


i^i»i_wv/ V ^.57o, aenne ^ur cysteine ) proteinase inhibitor, clade 
A (alpha-1 antiproteinase. antitrypsin), member 1 


33 


28.66 


L13720_at 


Hs.437710 


NM_000820; growth arrest-specific 6


33 


29.69 


M12125_at 


Hs.300772 


NM_003289; tropomyosin 2 (beta) 


28 


24.89 


M15395_at 


Hs.375957 


NM_000211; integrin beta chain, beta 2 precursor


33 


29.40 


M16591_s_at


Hs.89555 


NM_002110; hemopoietic cell kinase isoform p61HCK


33 


32.34 


M20530_at 






33 


30.28 


M23178_s_at 


Hs.73817 


NM_002983; chemokine (C-C motif) ligand 3


33 


35.36 


M32011_at 


Hs.949 


NM_000433; neutrophil cytosolic factor 2


33 


41.88 


M33195_at 


Hs.433300 


NM_004106; Fc fragment of IgE, high affinity I, receptor for, 
gamma polypeptide precursor 


33 


30.40 


M55998_s_at 


Hs.172928 


NM_000088; alpha 1 type I collagen preproprotein 


33 


26.83 


M57731_s_at 


Hs.75765 


NM_002089; chemokine (C-X-C motif) ligand 2 


33 


31.84 


M68840_at


Hs.183109 


NM_000240; monoamine oxidase A


33 


32.39 


M69203_s_at 


Hs.75703 


NM_002984; chemokine (C-C motif) ligand 4 precursor


33 


36.21 


M72885_rna1_s_at






33 


27.94 
M83822_at


Hs.209846 


MM 006726* 1 PR-rocnnn«ivA vacIcIa trafficklnn Koarh and 

anchor containing 


33 


26.44 


S77393_at 


Hs 145754 

1 I0i • W f **^**r 


NM_016531; Kruppel-like factor 3 (basic)


**fi 


49.85 


U01833_at 


Hs.81469 


NM 002484* nucleotide hindinn nrotein 1 /Minn hnmnlnn P 

coli) 


33 


30.62 


U07231_at 


Hs.309763 


NM_002092; G-rich RNA sequence binding factor 1


oo 


OO 4 A 

o9. 10 


U09937_rna1_s_at 






***** 
oo 


30.00 


U10550_at 


Hs.79022 


NM_005261; GTP-binding mitogen-induced T-cell protein 
NM_181702; GTP-binding mitogen-induced T-cell protein 


28 


25.26 


U20158_at 


Hs.2488 


NM_005565; lymphocyte cytosolic protein 2


oo 


32.41 


U41315_rna1_s_at






oo 


43.56 


U47414_at 


Hs.13291


NM_004354; cyclin G2


33 


44.42 


U49352_at 


Hs.414754 


NM 0013*5Q* 2 4-diPnnv/1 fVlA roHnrtaco 1 nramirenr 

iiivi_vvi«jj9, fc,*T*vjioi iuyi oom itiuuciobo i precursor 


OO 


37.04 


U50708_at 


Hs.1265 


NM_000056; branched chain keto acid dehydrogenase El, 
uoid ^uiyfjypuutj precursor rMivi_i oouou, orancnea cnam 
keto acid dehydrogenase E1, beta polypeptide precursor 


33 


42.89 


U52101_at 


Hs 9999 


NM nm 49 *V Anithatia! momKrana nrntaln 1 

i^iivi__w • *T^»j t epiuiuiiai rnernurane proiein o 


33 


29.86 


U64520_at 


Hs.66708 


inivi^ ywH/ o i , vtjbiue-assoaaiea memDrane protein o (ceilu- 
brevin) 


33 


30.17 


U65093_at


Hs.82071 


NM 006079* Chn/n^OO-lntAractinn francar>tiw9tar unfK 

Gfu/Asp-rich carboxy-terminal domain, 2 


33 


32.07 


U68019_at 


Hs.288261 


nm^vwjw^, iviml^, iiiyuiBib against oecapeniapiegic no- 
molog 3 


31 


26.70 


U68385_at 


Hs.380923 




33 


31.56 


U74324_at 


Hs.90875 


NM_002871; RAB-interacting factor


33 


30.26 


U77970_at 


Hs.321164 


NM 002518* neuronal PAS domain nrntain *r> mm n^ooic- 
■ ini^wfctiiw! i icuiui iai i /^o viwiiiciin protein ^ FMIVI UOZ^OO, 


33 


50.37 


U90549_at 


Hs.236774 


NM 006353' hiah mohilitv nronn ni iHoncnmai _ j_ 

main 4 


33 


32.16 


X04085_rna1_at






28 


25.13 


X07743_at 


Hs.77436 


NM_002664; pleckstrin 


33 


28.13 


X13334_at 


Hs.75627


i>iivi_uuui/3 i, uu i*f dnuytjn precursor 


33 


35.79 


X14046_at 


Hs.153053


NM 001774* f*n*^7 antinan 
iiivi^uu lift, wUO / dnuycn 


30 


24.70 


X15880_at 


Hs 415997 


i^ivi_uu 1 0+0, cauagen, type vi t aipna i precursor 


33 


31.51 


X15882_at 


Hs.420269 


inivi_iaj io**y, aipna & cype vi couagen isoform 2C2 precursor 

NM 0581 74* aloha 2 tvne \l\ cnllanon lenfAm or">o 
»^«»i_v»»*w 1 1 , di^na ^ ty^jo vi cuiiayen isoiorm «oza precur- 
sor NM 058175' alDha 2 tVOe VI OOllaOAO knfnrm OC9a 

precursor 


33 


32.32 


X51408_at 


Hs.380138 


NM_001822; chimerin (chimaerin) 1 


33 


30.51 


X53800_s_at 


Hs.89690 


NM_002090; chemokine (C-X-C motif) ligand 3


33 


33.63 


X54489_rna1_at 






33 


33.57 


X57579_s_at 






33 


41.43 


X64072_s_at 


Hs.375957 


NM_000211; integrin beta chain, beta 2 precursor


33 


43.21 


X67491_f_at 


Hs.355697 


NM_005271; glutamate dehydrogenase 1


33 


30.97 


X68194_at 


Hs.80919 


NM_006754; synaptophysin-like protein isoform a
NM_182715; synaptophysin-like protein isoform b 


33 


46.53 


X73882_at 


Hs.254605 


NM_003980; microtubule-associated protein 7 


33 


53.16 


X78520_at 


Hs.372528 


NM_001829; chloride channel 3 


33 


47.38 
Y00787_s_at


Hs.624 


NM_000584; interleukin 8 precursor


32 


27.54 




MS. 334534 


NM_002076; glucosamine (N-acetyl)-6-sulfatase precursor 


30 


25.44 


Z19554_s_at 


Hs.435800 


NM_003380; vimentin


27 


24.59 


Z26491_s_at 


Hs.240013


NM_000754; catechol-O-methyl transferase isoform MB-COMT
NM_007310; catechol-O-methyl transferase isoform S-COMT


32 


26.92 


Z29331_at


Hs.372758 


NM_182697; ubiquitin-conjugating enzyme E2H isoform 2


33 


33.49 


Z48605_at 


Hs.421825 


NM_006903; inorganic pyrophosphatase 2 isoform 2
NM_176865; NM_176866; inorganic pyrophosphatase 2 isoform 3
NM_176867; inorganic pyrophosphatase 2 isoform 4
NM_176869; inorganic pyrophosphatase 2 isoform 1


33 


44.45 


Z74615_at 


Hs.172928


NM_000088; alpha 1 type I collagen preproprotein 


33 


55.18 



Test for significance of classifier 

To test the class separation performance of the 71 selected genes we compared the B/W
ratios with the corresponding ratios of all the genes calculated from permutations of the arrays.
For each permutation we constructed three pseudogroups, pseudo-Ta, pseudo-T1, and pseudo-T2,
so that the proportion of samples from the three original groups was approximately the same in
the three pseudogroups. We then calculated the ratio of the variation between the
pseudogroups to the variation within the pseudogroups for all the genes. In 500
permutations, only twice was there a gene whose B/W value was higher than the
lowest of the original B/W values of the 71 selected genes (the two values being
25.28 and 25.93).
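
A permutation check of this kind can be sketched as follows. This is a simplified illustration, not the procedure used in the study: it shuffles the group labels at random rather than building pseudogroups with matched proportions of the original groups, it assumes a between/within sum-of-squares definition of B/W, and all names and values are hypothetical.

```python
import random


def bw_ratio(values, labels):
    """Between-group over within-group sum of squares (assumed definition)."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    grand_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for members in groups.values():
        mean = sum(members) / len(members)
        between += len(members) * (mean - grand_mean) ** 2
        within += sum((v - mean) ** 2 for v in members)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def count_exceedances(expr, labels, threshold, n_perm=500, seed=0):
    """Number of label permutations in which at least one gene reaches
    a B/W value above the threshold."""
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # simplification: unstratified label shuffle
        best = max(bw_ratio(values, shuffled) for values in expr.values())
        if best > threshold:
            hits += 1
    return hits


# Hypothetical expression values; the threshold would be the lowest B/W
# value among the selected classifier genes.
labels = ["Ta", "Ta", "Ta", "T1", "T1", "T1"]
expr = {
    "g1": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],
    "g2": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
}
observed_min = min(bw_ratio(values, labels) for values in expr.values())
print(count_exceedances(expr, labels, observed_min, n_perm=100))
```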

The classifier performance was tested using from 1-160 genes in cross-validation loops, and
a model using an 80 gene cross-validation scheme showed the best correlation to pathologic
staging (p<10^-9). The 71 genes that were used in at least 75% of the cross-validation loops
were selected to constitute our final classifier model. See the expression profiles of the 71
genes in Figure 10. The genes are clustered to obtain a better overview of similar expression
patterns. From this it is apparent that the T1 stage is characterised by having expression
patterns in common with either Ta or T2 tumours. There are no single genes that can be used
as a T1 marker.

Permutation analysis 

To test the class separation performance of the 71 selected genes we compared their
performance to that of a permuted set of pseudo-Ta, T1 and T2 tumours. In 500 permutations
we detected only two genes with a performance equal to that of the poorest performing
classifying genes.
Classification using 80 predictive genes and other gene-sets

The classification using 80 predictive genes in cross-validation loops identified the Ta group
with no surrounding CIS and no previous tumor or no previous tumor of a higher stage
(Table 8). Interestingly, the Ta tumours surrounded by CIS that were classified as T2 or T1
clearly demonstrate the potential of the classification method for identifying surrounding CIS
in a non-invasive way, thereby supplementing clinical and pathologic information.



Table 8 Clinical data on disease courses and results of molecular classification



Tumours" 




T1 grade 3 ^| 



:T2 grade- 3 .., j^l 

-<^** v \s£?. 

-yvr*- ^i^r .--'^ v - v ; < * :. ■ 



Patient 


Previous 


Tumour 


Subsequent 


Carcinoma 


Reviewed 


Molecular classifier" 




tumours 


analysed 


tumours 


in sit J* 


histology 0 


320 


80 


20 


Ta grade 


II tumours- 


no progression 














709-1 




Tagr2 




No 


Ta gr3 


Ta 


Ta 


Ta 


968-1 




Tagr2 


1Ta 


No 




Tartl 


Ta 


Ta 


934-1 




Tagr2 




No 




T1 


Ta 


Ta 


928-1 




Ta gr2 




No 




Ta 


Ta 


T1 


930-1 




Ta gr2 




No 




Ta 


Ta 


Ta 


Ta grade III tumours - 


no prior T1 tumour or CIS 












989-1 




Tagr3 




No 




Ta 


Ta 


Ta 


1264-1 




Ta gr3 


3Ta 


No 




Ta 


Ta 


Ta 


876-5 


4Ta 


Ta gr3 




No 




Ta 


Ta 


Ta 


669-7 


5Ta 


Ta gr3 


4Ta 


No 


Ta gr2 


Ta 


Ta 


Ta 


716-2 


iTa 


Ta gr3 


2Ta 


No 




Ta 


Ta 


Ta 


Ta grade III tumours - 


no prior T1 tumour but CIS in selected site biopsies 










1070-1 




Tagr3 


1Ta 


Subsequent visit 




Ta 


Ta 


Ta 


956-2 




Ta gr3 


ITa 


Sampling visit 




T2 


T2 


T2/T1 


1062-2 




Ta gr3 


1 T1 


Sampling visit 




T2/Ta 


T1/Ta 


Ta 


1166-1 




Tagr3 




Sampling visit 




Tarn 


Ta 


Ta 


1330-1 




Tagr3 




Sampling visit 




T2 


T2 


Ta 


Ta grade III tumours - 


a prior T1 tumour and CIS in selected site biopsies 










747-7 


5Ta, 1T1 


Tagr3 


3Ta 


Sampling visit 




Ta 


Ta 


Ta 


112-10 


7Ta. 2T1 


Tagr3 


2Ta,4Tl 


Previous visit 




Ta 


Ta 


Ta 


320-7 


1 Ta,2T1 


Tagr3 


2Ta 


Sampling visit 




T2 


T2 


Ta 


967-3 


2T1 


Tagr3 


1T1 


Sampling visit 




Ta 


Ta 


Ta 


T1 grade III tumours - 


no prior muscle invasive tumour 












625-1 




T1 gr3 




No 




T1 


T1 


T1 


847-1 




T1 gr3 




No 




T1 


T1 


T1 


1257-1 




T1 gr3 




Sampling visit 




T1 


T1 


T1 


919-1 




T1 gr3 




No 




T1 


T1 


T1 


880-1 




T1 gr3 


4Ta 


No 




T1 


T1 


T1 


812-1 




T1 gr3 




No 




T1 


T1 


T1 


1269-1 




T1 gr3 




No 


No review 


T1 


T1 


T1 


1083-2 


ITa 


T1 gr3 




No 


No review 


T1 


T1 


T1 


1238-1 




T1 gr3 


1 Ta. 1 T2+ 


No 




T1 


T1 


T1 


1085-1 




T1 gr3 




Subsequent visit 


No review 


T1 


T1 


T1 


1134-1 




T1 gr3 


3T1 


Sampling visit 


T2 gr3 


T1 


T1 


T1 


T2+ graae iii/iv tumours - only primary tumours 












1164-1 




T2* gr4 




No 


T2+ gr3 


T2/T1 


T1 


T1 


1032-1 




T2+ gr? 




NO 


No review 


T2 


T2 


T2 


1117-1 




T2+gr3 




ND 




T2 


T2 


T1 


1178-1 




T2+ gr3 




ND 




T2 


T2 


T2 
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1078-1 


T2+ gr3 


ND 




T2 


T2 


T2 


875-1 


T2+gr3 


No 




T2 


T2 


T2 


1044-1 


T2+ gr3 


1 T2* ND 




T2 


T2 


T2 


1133-1 


T2+ gr3 


ND 




T2 


T2 


T2 


1068-1 


T2+ gr3 


No 




T2 


T2 


T2 


937-1 


T2+gr3 


ND 


No review 


T1 


Tl 


T1 



Examples of tumour histology. 
b Carcinoma in situ detected in selected site biopsies at the time of sampling tumour tissue 
for the arrays or at previous or subsequent visits. 

c All tumours were reviewed by a single uro-pathologist and any change compared to the 
routine classification is listed. 

d Molecular classification based on 320, 80, and 20 genes cross-validation loops. 
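
How closely the 320-, 80-, and 20-gene classifications agree can be tallied with a few lines of code. The sketch below uses a handful of rows transcribed from Table 8 purely as an illustration; the helper names are hypothetical, and treating a mixed call such as "T2/T1" as a distinct label is an assumption of this example.

```python
# (patient, 320-gene call, 80-gene call, 20-gene call), transcribed from Table 8
ROWS = [
    ("709-1", "Ta", "Ta", "Ta"),
    ("928-1", "Ta", "Ta", "T1"),
    ("956-2", "T2", "T2", "T2/T1"),
    ("625-1", "T1", "T1", "T1"),
    ("1164-1", "T2/T1", "T1", "T1"),
    ("1117-1", "T2", "T2", "T1"),
    ("1178-1", "T2", "T2", "T2"),
]


def count_agreement(rows, i, j):
    """Number of rows in which columns i and j carry the same call."""
    return sum(1 for row in rows if row[i] == row[j])


def count_unanimous(rows):
    """Number of rows in which all three gene sets give the same call."""
    return sum(1 for _, a, b, c in rows if a == b == c)
```

Applied to the full table, the same tally would show which tumours change class as the gene set shrinks.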



Classification using other gene-sets

Classification was also carried out using other gene-sets (10, 20, 32, 40, 80, 160, and 320
genes). These gene-sets demonstrated the same classification tendency as the 71 genes.
See Tables 9-15 for the gene-sets.
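
One plausible way to derive gene-sets of these sizes is to rank the genes by a separation score and keep the top k for each size. The text does not state how the sets were built, so the ranking criterion, the nesting of the sets, and the function name `top_k_gene_sets` below are all assumptions made for illustration.

```python
def top_k_gene_sets(scores, sizes=(10, 20, 32, 40, 80, 160, 320)):
    """Map each requested size k to the k highest-scoring genes (assumed ranking scheme)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Sizes larger than the number of scored genes are skipped.
    return {k: ranked[:k] for k in sizes if k <= len(ranked)}
```

Under this assumption each smaller set is contained in every larger one, which makes it straightforward to compare classification results across set sizes.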



Table 9. 320 genes for classifier

Chip acc. # | UniGene Build 162 | description
AB000220_at | Hs.171921 | NM_006379; semaphorin 3C
AC002073_cds1_at |  |
AF000231_at | Hs.75618 | NM_004663; Ras-related protein Rab-11A
D10922_s_at | Hs.99855 | NM_001462; formyl peptide receptor-like 1
D10925_at | Hs.301921 | NM_001295; chemokine (C-C motif) receptor 1
D11086_at | Hs.84 | NM_000206; interleukin 2 receptor, gamma chain, precursor
D11151_at | Hs.211202 | NM_001957; endothelin receptor type A
D13435_at | Hs.426142 | NM_002643; phosphatidylinositol glycan, class F isoform 1 NM_173074; phosphatidylinositol glycan, class F isoform 2
D13666_s_at | Hs.136348 | NM_006475; osteoblast specific factor 2 (fasciclin I-like)
D14520_at | Hs.84728 | NM_001730; Kruppel-like factor 5
D21878_at | Hs.169998 | NM_004334; bone marrow stromal cell antigen 1 precursor
D26443_at | Hs.371369 | NM_004172; solute carrier family 1 (glial high affinity glutamate transporter), member 3
D28589_at | Hs.17719 |
D42046_at | Hs.194665 |
D45370_at | Hs.74120 | NM_006829; adipose specific 2
D49372_s_at | Hs.54460 | NM_002986; small inducible cytokine A11 precursor
D50495_at | Hs.224397 | NM_003195; transcription elongation factor A (SII), 2
D63135_at | Hs.27935 | NM_032646; tweety homolog 2
D64053_at | Hs.198288 | NM_002849; protein tyrosine phosphatase, receptor type, R isoform 1 precursor NM_130846; protein tyrosine phosphatase, receptor type, R isoform 2
D83920_at | Hs.440898 | NM_002003; ficolin 1 precursor
D85131_s_at | Hs.433881 | NM_002383; MYC-associated zinc finger protein
D86062_s_at | Hs.413482 | NM_004649; chromosome 21 open reading frame 33
D86479_at | Hs.439463 | NM_001129; adipocyte enhancer binding protein 1 precursor
D86957_at | Hs.307944 |
D86959_at | Hs.105751 | NM_014720; Ste20-related serine/threonine kinase
D86976_at | Hs.196914 |
D87433_at | Hs.301989 | NM_015136; stabilin 1
D87443_at | Hs.409862 | NM_014758; sorting nexin 19
D87682_at | Hs.134792 |
D89077_at | Hs.75367 | NM_006748; Src-like-adaptor
D89377_at | Hs.89404 | NM_002449; msh homeo box homolog 2
D90279_s_at | Hs.433695 | NM_000093; alpha 1 type V collagen preproprotein
HG1996-HT2044_at |  |
HG2090-HT2152_s_at |  |
HG2463-HT2559_at |  |
HG2994-HT4850_s_at |  |
HG3044-HT3742_s_at |  |
HG3187-HT3366_s_at |  |
HG3342-HT3519_s_at |  |
HG371-HT26388_s_at |  |
HG4069-HT4339_s_at |  |
HG67-HT67_f_at |  |
HG907-HT907_at |  |
J02871_s_at | Hs.436317 | NM_000779; cytochrome P450, family 4, subfamily B, polypeptide 1
J03040_at | Hs.111779 | NM_003118; secreted protein, acidic, cysteine-rich (osteonectin)
J03060_at |  |
J03068_at |  |
J03241_s_at | Hs.2025 | NM_003239; transforming growth factor, beta 3
J03278_at | Hs.307783 | NM_002609; platelet-derived growth factor receptor beta precursor
J03909_at |  |
J03925_at | Hs.172631 | NM_000632; integrin alpha M precursor
J04056_at | Hs.88778 | NM_001757; carbonyl reductase 1
J04058_at | Hs.169919 | NM_000126; electron transfer flavoprotein, alpha polypeptide
J04093_s_at | Hs.278896 | NM_019075; UDP glycosyltransferase 1 family, polypeptide A10
J04130_s_at | Hs.75703 | NM_002984; chemokine (C-C motif) ligand 4 precursor
J04152_rna1_s_at |  |
J04162_at | Hs.372679 | NM_000569; Fc fragment of IgG, low affinity IIIa, receptor for (CD16)
J04456_at | Hs.407909 | NM_002305; beta-galactosidase binding lectin precursor
J05032_at | Hs.32393 | NM_001349; aspartyl-tRNA synthetase
J05036_s_at | Hs.1355 | NM_001910; cathepsin E isoform a preproprotein NM_148964; cathepsin E isoform b preproprotein
J05070_at | Hs.151738 | NM_004994; matrix metalloproteinase 9 preproprotein
J05448_at | Hs.79402 | NM_002694; DNA directed RNA polymerase II polypeptide C NM_032940; DNA directed RNA polymerase II polypeptide C
K01396_at | Hs.297681 | NM_000295; serine (or cysteine) proteinase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 1
K03430_at |  |
L06797_s_at | Hs.421986 | NM_003467; chemokine (C-X-C motif) receptor 4
L10343_at | Hs.112341 | NM_002638; skin-derived protease inhibitor 3 preproprotein
L11708_at | Hs.155109 | NM_002153; hydroxysteroid (17-beta) dehydrogenase 2
L13391_at | Hs.78944 | NM_002923; regulator of G-protein signalling 2, 24kDa
L13698_at | Hs.65029 | NM_002048; growth arrest-specific 1
L13720_at | Hs.437710 | NM_000820; growth arrest-specific 6
L13923_at | Hs.750 | NM_000138; fibrillin 1









Table 10. 160 Genes for classifier 



Chip acc. # 


UniGene Build 162 


description 


AF000231_at 


Hs.75618 


NMJ)04663; Ras-related protein Rab-11A 


D13666_s_at 


Hs.1 36348 


NM_006475; osteoblast specific factor 2 (fascJclin l-Jike) 


D21878_at 


Hs.169998 


NM_004334; bone marrow stromal cell antigen 1 precursor 


D45370_at 


Hs.74120 


NM_006829; adipose specific 2 


D49372_s_at 


Hs.54460 


NM_002986; small inducible cytokine A1 1 precursor 


D83920_at 


Hs.440898 


NM_002003* ficolin 1 precursor 


D85131_s_at 


Hs.433881 


NMJ)02383; MYC-assoctated zinc finger protein 


D86062_s_at 


Hs.4 13482 


NM_004649; chromosome 21 open reading frame 33 


D86479_at 


Hs.439463 


NM_001129; adipocyte enhancer binding protein 1 precursor 


D86957_at 


Hs.307944 




D86976_at 


Hs. 19691 4 




D87433_at 


Hs.301989 


NM_015136; stabilln 1 


D89077_at 


Hs.75367 


NM_006748; Src-like-adaptor 


D89377_at 


Hs.89404 


NM_002449; msh homeo box homolog 2 


HG3044-HT3742_s_at 






HG371-HT26388_s_at 






HG4069-HT4339_s_at 






HG67-HT67_f_at 






HG907-HT907_at 






J02871_s_at 


Hs.436317 


NM_000779; cytochrome P450, family 4, subfamily B. polypeptide 
1 


J03040_at 


Hs.11 1779 


NM_003118; secreted protein, acidic, cysteine-rich (osteonectin) 


J03068_at 






J03241,s_at 


Hs.2025 


NM_Q03239; transforming growth factor, beta 3 


J03278_at 


Hs.307783 


NMJJ026G9; platelet-derived growth factor receptor beta precursor 


J03909_at 






J04058_at 


Hs. 16991 9 


NM_000126; electron transfer ftavoprotein, alpha polypeptide 
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J04130_s_at 


Hs.75703 


NM_002984; chemokine (C-C motiO ligand 4 precursor 




Hs.372679 


NM_000569; Fc fragment of IgG, low affinity Ilia, receptor for 
(CD16) 


J044OD_at 


Hs.407909 


NMJ)02305; beta-galactosidase binding lectin precursor 




Hs. 32393 


NM_001349; aspartyl-tRNA synthetase 


juouf u_ai 


Hs. 151 738 


NM_004994; matrix metalloproteinase 9 preproprotein 


JUD44q_ai 


Hs.79402 


NM__002694; DNA directed RNA polymerase II polypeptide C 
NM_032940; DNA directed RNA polymerase II polypeptide C 


KHIIQfi at 


rts.^y/ool 


NMJ)00295; serine (or cysteine) proteinase inhibitor, clade A 
(aipha-1 antiproteinase, antitrypsin), member 1 


KO^l^n at 






L13698 at 




nm_uu\cU4o; growth arrest-specific 1 


L13720 at 




nm_uuoo2u; growth arrest-specific 6 


I 1392^ at 


ns.f du 


NM_000138; fibrillin 1 


L15409_at 


Hs.421597 


NM_000551; elogin binding protein 


L17325_at 


Hs. 195825 


NM_006867; RNA-binding protein with multiple splicing 


L19872_at 


Hs. 170087 


NM_001621; aryl hydrocarbon receptor 


L27476_at 


Hs.75608 


NM_004817; tight junction protein 2 (zona occludens 2) 


L»53799_at 


Hs.202097 


NMJ)02593; procollagen C-endopeptidase enhancer 


L403oo_at 


Hs.30212 


NM_004236; thyroid receptor interacting protein 15 


L40904__at 


Hs.387667 


NMJJ05037; peroxisome proliferative activated receptor gamma 
isofonm 1 NM_0 15869; peroxisome proliferative activated receptor 
gamma isoform 2 NM_1 38711; peroxisome proliferative activated 
receptor gamma isoform 1 NNM38712; peroxisome proliferative 
activated receptor gamma isoform 1 








M11433_at 


Hs. 101 850 


NMJJ02899; retinol binding protein 1, cellular 


M1171ft at 

IVI 1 1 f I Q ol 




NM_0Q0393; alpha 2 type V collagen preproprotein 


M1212S at 


HS.oUU77Z 


NM_003289; tropomyosin 2 (beta) 


M14218 at 

IVI Iti IO dl 


Wo >i>*on>i7 

ns.«M»^U4f 


NMJD00048; argininosuccinate lyase 


MIRSQ 1 ? at 

IVI UJqJjBl 


Ur< 07C0C7 

ns. 0/ oyo7 


NMJ)00211; integrin beta chain, beta 2 precursor 


M16591_s_at 


Hs.89555 


NM_002110; hemopoietic cell kinase isoform p61HCK 


ivi 1 1 £\ y__ai 


Hs. 203862 


NM_002069; guanine nucleotide binding protein (G protein), alpha 
inhibiting activity polypeptide 1 


M?n^n at 






M2"*17A o at 


HS.73817 


NM_002983; chemokine (C-C motif) ligand 3 


M28130_rna1_s_at 






ivLcyoDU_at 


Hs. 187543 


NM_021132; protein phosphatase 3 (formerly 28), catalytic sub- 
unit, beta isoform (calcineurin A beta) 


mo i ioo_ai 


Hs.407546 


NMJJ071 15; tumor necrosis factor, alpha-induced protein 6 pre- 
cursor 


M32011_at 


Hs.949 


NMJ)00433; neutrophil cytosolic factor 2 


M33195__at 


Hs.433300 


NM_004106; Fc fragment of IgE. high affinity I, receptor for, 
gamma polypeptide precursor 


M37033_at 


Hs.443057 


NM_000560; CD53 antigen 


M37766_at 


Hs.901 


NM_001 778; CD48 antigen (B-cell membrane protein) | 


M55998__s_at 


Hs. 172928 


NMJ)G0088; alpha 1 type I collagen preproprotein 
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M57731 s at 


na.f 3f \jO 






Me A*>*^A*> 


rvjivi uu iDor , acyioxyacyi nyuroiaso precursor 


M63262 at 








ns.i ooiuy 


NM — 000240; monoamine oxidase A 


ivio574uo_5_ai 


MS./ Of 03 


NM__002984, cnemokine (C-C motif) hgand 4 precursor 


IVI f Ilia 1 o a I 






M77349 at 




K1K4 nnnlCQ' trim fj-.iiLiL.-i .nwuxtki fn j>tj-i.r k A |- , fioi,r\_ 

iNivi_uuuooo > iransrorming growth factor, Deta-induced, 68k Da 


at 




nm^i r'ZJ^o, C74-UK6 factor i (ets oomatn transcnption factor) 


M83822_at 


Hs.209846 


NMJJ06726; LPS-responsive vesicle trafficking, beach and anchor 
containing 




P1S.41UUJ7 


kill AA4 ftA4 • ■ x?_ . _ .. - . . .Ai- ^ ■ 

NM_001901; connective tissue growth factor 


MQC4 7Q of 

Magi fO_dl 


PIS. I lyuoo 


kill aaj «4 AA. ^ ^.a; I | i_ j 

NM_001102; actinin, alpha 1 


ooyi io_ai 


MS. lU^JUb 


NM_005601 ; natural killer cell group 7 sequence 


or i «5yo_ai 


ns.i4o/o4 


NM__016531; KruppeMike factor 3 (basic) 




PIS. lOOf OiC 


NM_004358; cell division cycle 25B isoform 1 NM_021872; cell 
division cycle 25B isoform 2 NM_021873; cell division cycle 25B 
isoiorm j nm_uzi o74 f cell division cycle 25B isoform 4 


U01833_at 


Hs.81469 


NM_002484; nucleotide binding protein 1 (MinD homolog, E. coli)




Hs.309763


NM_002092; G-rich RNA sequence binding factor 1 


UU9& / O a I 


MS.40Q004 


NM_004460; fibroblast activation protein, alpha subunit 


U09937_rna1_s_at






U10550_at 


Hs.79022 


NM_005261; GTP-binding mitogen-induced T-cell protein
NM_181702; GTP-binding mitogen-induced T-cell protein


U12424_s_at 


Hs.108646 


NM_000408; glycerol-3-phosphate dehydrogenase 2 (mitochondrial)


U16306_at 


Hs.434488 


NM_004385; chondroitin sulfate proteoglycan 2 (versican) 


U20158_at 


Hs.2488 


NM_005565; lymphocyte cytosolic protein 2 


u^uodo_s__at 


Hs.3280 


NM_001226; caspase 6 isoform alpha preproprotein NM_032992;
caspase 6 isoform beta 


uz4^oo_ai 


Hs.77448 


NM_003748; aldehyde dehydrogenase 4A1 precursor
NM_170726; aldehyde dehydrogenase 4A1 precursor


u^o^«fy^_ai 


Hs.301350


NM_005971; FXYD domain containing ion transport regulator 3 
isoform 1 precursor NM_021910; FXYD domain containing ion 
transport regulator 3 isoform 2 precursor 


U28488_s_at


Hs.155935


NM_004054; complement component 3a receptor 1 


U29680_at


Hs.227817


NM_004049; BCL2-related protein A1


U37143_at


Me 1 *>0nQR 


NM_000775; cytochrome P450, family 2, subfamily J, polypeptide 2


U38864_at 


Hs.108139


NM_012256; zinc finger protein 212 


U39840_at 


Hs.163484


NM_004496; forkhead box A1 


U41315_rna1_s_at






U44111_at 


Hs.42151 


NM_006895; histamine N-methyltransferase


U47414_at 


Hs.13291 


NM_004354; cyclin G2 


U49352_at


Hs.414754


NM_001359; 2,4-dienoyl CoA reductase 1 precursor


U50708_at 


Hs.1265 


NM_000056; branched chain keto acid dehydrogenase E1, beta
polypeptide precursor NM_183050; branched chain keto acid
dehydrogenase E1, beta polypeptide precursor






U52101_at 


Hs.9999 


NM_001425; epithelial membrane protein 3 


U59914_at 


Hs.153863


NM_005585; MAD, mothers against decapentaplegic homolog 6


U60205_at 


Hs.393239 


NM_006745; sterol-C4-methyl oxidase-like


U61981_at 


Hs.42674 


NM_002439; mutS homolog 3 


U64520_at 


Hs.66708 


NM_004781; vesicle-associated membrane protein 3 (cellubrevin) 


U65093_at 


Hs.82071 


NM_006079; Cbp/p300-interacting transactivator, with Glu/Asp-rich
carboxy-terminal domain, 2


U66619_at 


Hs.444445 


NM_003078; SWI/SNF-related matrix-associated actin-dependent 
regulator of chromatin d3 


U68019_at 


Hs.288261 


NM_005902; MAD, mothers against decapentaplegic homolog 3 


U68385_at 


Hs.380923 




U68485_at 


Hs.193163


NM_004305; bridging integrator 1 isoform 8 NM_139343; bridging
integrator 1 isoform 1 NM_139344; bridging integrator 1 isoform 2
NM_139345; bridging integrator 1 isoform 3 NM_139346; bridging
integrator 1 isoform 4 NM_139347; bridging integrator 1 isoform 5
NM_139348; bridging integrator 1 isoform 6 NM_139349; bridging
integrator 1 isoform 7 NM_139350; bridging integrator 1 isoform 9
NM_139351; bridging integrator 1 isoform 10


U74324_at 


Hs.90875 


NM_002871; RAB-interacting factor 


U77970_at 


Hs.321164 


NM_002518; neuronal PAS domain protein 2 NM_032235; 


U83303_cds2_at


Hs.164021


NM_002993; chemokine (C-X-C motif) ligand 6 (granulocyte
chemotactic protein 2) 


U88871_at 


Hs.79993 


NM_000288; peroxisomal biogenesis factor 7 


U90549_at 


Hs.236774 


NM_006353; high mobility group nucleosomal binding domain 4 


U90716_at 


Hs.79187 


NM_001338; coxsackie virus and adenovirus receptor 


V00594_at 


Hs.118786


NM_005953; metallothionein 2A


V00594_s_at 


Hs.118786


NM_005953; metallothionein 2A


X02761_s_at 


Hs.418138 


NM_002026; fibronectin 1 isoform 1 preproprotein NM_054034;
fibronectin 1 isoform 2 preproprotein 


X04011_at 


Hs.88974 


NM_000397; cytochrome b-245, beta polypeptide (chronic granulomatous
disease)


X04085_rna1_at 






X07438_s_at 






X07743_at 


Hs.77436 


NM_002664; pleckstrin


X13334_at 


Hs.75627 


NM_000591; CD14 antigen precursor 


X14046_at 


Hs.153053


NM_001774; CD37 antigen


X14813_at


Hs.166160


NM_001607; acetyl-Coenzyme A acyl transferase 1 


X15880_at 


Hs.415997 


NM_001848; collagen, type VI, alpha 1 precursor 


X15882_at 


Hs.420269 


NM_001849; alpha 2 type VI collagen isoform 2C2 precursor
NM_058174; alpha 2 type VI collagen isoform 2C2a precursor
NM_058175; alpha 2 type VI collagen isoform 2C2a precursor


X51408_at 


Hs.380138 


NM_001822; chimerin (chimaerin) 1


X53800_s_at 


Hs.89690 


NM_002090; chemokine (C-X-C motif) ligand 3


X54489_rna1_at 






X57351_s_at 


Hs.174195


NM_006435; interferon induced transmembrane protein 2 (1-8D)


X57579_s_at






X58072_at 


Hs.169946


NM_002051; GATA binding protein 3 NM_032742;


X62048_at



X64072_s_at 



X65614_at 
X66945_at 



X67491_f_at 
X68194_at 



X73882_at 
X78520_at 



X78549_at 
X78565_at



X84908_at 



Y08374_rna1_at



Z12173_at 



Z19554_s_at
Z26491_s_at 



Z29331_at 
Z35491_at 



Z48199_at 



Z48605_at 



Z74615_at 



Hs.249441 



Hs.375957 



Hs.2962 
Hs.748



Hs.355697 
Hs.80919 



Hs.254605 



Hs.372528 



Hs.51133 
Hs.98998 



Hs.79088 
Hs.59889 



Hs.78060 



Hs.407856 
Hs.624 
Hs.75216 



Hs.334534 



Hs.435800 
Hs.240013 



Hs.372758 

Hs.377484 
Hs.82109 



Hs.421825 



Hs.172928 






NM_003390; wee1 tyrosine kinase



NM_000211; integrin beta chain, beta 2 precursor



NM_005980; S100 calcium binding protein P 
NM_000604; fibroblast growth factor receptor 1 isoform 1 precursor
NM_015850; fibroblast growth factor receptor 1 isoform 2
precursor NM_023105; fibroblast growth factor receptor 1 isoform
3 precursor NM_023106; fibroblast growth factor receptor 1 isoform
4 precursor NM_023107; fibroblast growth factor receptor 1
isoform 5 precursor NM_023108; fibroblast growth factor receptor
1 isoform 6 precursor NM_023109; fibroblast growth factor receptor
1 isoform 7 precursor NM_023110; fibroblast growth factor
receptor 1 isoform 8 precursor NM_023111; fibroblast growth
factor receptor 1 isoform 9 precursor



NM_005271; glutamate dehydrogenase 1 
NM_006754; synaptophysin-like protein isoform a NM_182715;
synaptophysin-like protein isoform b



NM_003980; microtubule-associated protein 7



NM_001829; chloride channel 3



NM_005975; PTK6 protein tyrosine kinase 6
NM_002160; tenascin C (hexabrachion)



NM_002902; reticulocalbin 2, EF-hand calcium binding domain

NM_005518; 3-hydroxy-3-methylglutaryl-Coenzyme A synthase 2
(mitochondrial)



NM_000293; phosphorylase kinase, beta




NM_003122; serine protease inhibitor, Kazal type 1
NM_000584; interleukin 8 precursor
NM_002840; protein tyrosine phosphatase, receptor type, F isoform
1 precursor NM_130440; protein tyrosine phosphatase,
receptor type, F isoform 2 precursor



NM_002076; glucosamine (N-acetyl)-6-sulfatase precursor



NM_003380; vimentin 



NM_000754; catechol-O-methyltransferase isoform MB-COMT
NM_007310; catechol-O-methyltransferase isoform S-COMT



NM_003344; ubiquitin-conjugating enzyme E2H isoform 1
NM_182697; ubiquitin-conjugating enzyme E2H isoform 2
NM_004323; BCL2-associated athanogene isoform 1L
NM_002997; syndecan 1



NM_006903; inorganic pyrophosphatase 2 isoform 2 NM_176865;
NM_176866; inorganic pyrophosphatase 2 isoform 3 NM_176867;
inorganic pyrophosphatase 2 isoform 4 NM_176869; inorganic
pyrophosphatase 2 isoform 1



NM_000088; alpha 1 type I collagen preproprotein


AF000231_at
D13666_s_at



UniGene Build 162 



Hs.75618 



description 



NM_004663; Ras-related protein Rab-11A 



Hs.136348 



D49372_s_at 



Hs.54460 



D83920_at



Hs.440898 



NM_006475; osteoblast specific factor 2 (fasciclin I-like)



NM_002986; small inducible cytokine A11 precursor 



NM_002003; ficolin 1 precursor



NM_001129; adipocyte enhancer binding protein 1 precursor
NM_015136; stabilin 1



D87433_at 
D89077_at



D89377_at



Hs.89404 



HG4069-HT4339_s_at 
HG67-HT67_f_at



NM_006748; Src-like-adaptor 



NM_002449; msh homeo box homolog 2



J02871_s_at



NM_000779; cytochrome P450, family 4, subfamily B, polypeptide
1 



J03278_at
J04058_at



J05032_at



J05070_at 



J05448_at 



K01396_at



Hs.307783
Hs.169919



Hs.32393 



Hs.151738



Hs.79402 



Hs.297681 



NM_002609; platelet-derived growth factor receptor beta precursor
NM_000126; electron transfer flavoprotein, alpha polypeptide



NM_001349; aspartyl-tRNA synthetase 



NM_004994; matrix metalloproteinase 9 preproprotein 



NM_002694; DNA directed RNA polymerase II polypeptide C
NM_032940; DNA directed RNA polymerase II polypeptide C



NM_000295; serine (or cysteine) proteinase inhibitor, clade A
(alpha-1 antiproteinase, antitrypsin), member 1 



NM_000820; growth arrest-specific 6 
NM_005037; peroxisome proliferative activated receptor gamma
isoform 1 NM_015869; peroxisome proliferative activated receptor
gamma isoform 2 NM_138711; peroxisome proliferative activated
receptor gamma isoform 1 NM_138712; peroxisome proliferative
activated receptor gamma isoform 1



M15395_at



M16591_s_at
M20530_at



M23178_s_at
M32011_at



Hs.375957 
Hs.89555 



Hs.73817 
Hs.949 



NM_003289; tropomyosin 2 (beta) 



NM_000211; integrin beta chain, beta 2 precursor



NM_002110; hemopoietic cell kinase isoform p61HCK



NM_002983; chemokine (C-C motif) ligand 3



NM_000433; neutrophil cytosolic factor 2 



M55998_s_at 



Hs.172928
Hs.75765



NM_004106; Fc fragment of IgE, high affinity I, receptor for,
gamma polypeptide precursor



NM_000088; alpha 1 type I collagen preproprotein



NM_002089; chemokine (C-X-C motif) ligand 2


containing


S77393_at


Hs.145754


NM_016531; Kruppel-like factor 3 (basic)


U01833_at


Hs.81469


NM_002484; nucleotide binding protein 1 (MinD homolog, E. coli)


U07231_at


Hs.309763


NM_002092; G-rich RNA sequence binding factor 1


U09937_rna1_s_at






U10550_at





NM_005261; GTP-binding mitogen-induced T-cell protein
NM_181702; GTP-binding mitogen-induced T-cell protein


U20158_at 


Hs.2488 


NM_005565; lymphocyte cytosolic protein 2 


U28488_s_at 


Hs.155935


NM_004054; complement component 3a receptor 1 




Hs.227817


NM_004049; BCL2-related protein A1 


U41315_rna1_s_at






U47414_at




NM_004354; cyclin G2


U49352_at


Hs.414754


NM_001359; 2,4-dienoyl CoA reductase 1 precursor 


U50708_at 


Hs.1265 


NM_000056; branched chain keto acid dehydrogenase E1, beta
polypeptide precursor NM_183050; branched chain keto acid
dehydrogenase E1, beta polypeptide precursor 


U52101_at




NM_001425; epithelial membrane protein 3


U59914_at


Hs.153863


NM_005585; MAD, mothers against decapentaplegic homolog 6


U64520_at


Hs.66708


NM_004781; vesicle-associated membrane protein 3 (cellubrevin)


U65093_at


Hs.82071


NM_006079; Cbp/p300-interacting transactivator, with Glu/Asp-rich
carboxy-terminal domain, 2


U68019_at 


Hs.288261


NM_005902; MAD, mothers against decapentaplegic homolog 3


U68385_at


Hs.380923




U74324_at 


Hs.90875


NM_002871; RAB-interacting factor


U77970_at 


Hs.321164


NM_002518; neuronal PAS domain protein 2 NM_032235;


U90549_at


Hs.236774


NM_006353; high mobility group nucleosomal binding domain 4


X04085_rna1_at






X07438_s_at 






X07743_at


Hs.77436


NM_002664; pleckstrin




Hs.75627


NM_000591; CD14 antigen precursor 


X14046_at


Hs.153053


NM_001774; CD37 antigen 


X15880_at


Hs.415997


NM_001848; collagen, type VI, alpha 1 precursor


X15882_at


Hs.420269


NM_001849; alpha 2 type VI collagen isoform 2C2 precursor
NM_058174; alpha 2 type VI collagen isoform 2C2a precursor
NM_058175; alpha 2 type VI collagen isoform 2C2a precursor


X51408_at 


Hs.380138


NM_001822; chimerin (chimaerin) 1


X53800_s_at 


Hs.89690 


NM_002090; chemokine (C-X-C motif) ligand 3 


X54489_rna1_at






X57579_s_at






X62048_at


Hs.249441


NM_003390; wee1 tyrosine kinase


X64072_s_at 


Hs.375957 


NM_000211; integrin beta chain, beta 2 precursor


X67491_f_at


Hs.355697 


NM_005271; glutamate dehydrogenase 1 


X68194_at 


Hs.80919 


NM_006754; synaptophysin-like protein isoform a NM_182715; 
synaptophysin-like protein isoform b 


X73882_at


Hs.254605


NM_003980; microtubule-associated protein 7


X78520_at 


Hs.372528 


NM_001829; chloride channel 3


X97267_rna1_s_at


Y00787_s_at


Hs.624


NM_000584; interleukin 8 precursor


Z12173_at 


Hs.334534


NM_002076; glucosamine (N-acetyl)-6-sulfatase precursor 


Z19554_s_at 


Hs.435800 


NM_003380; vimentin 


Z26491_s_at 


Hs.240013 


NM_000754; catechol-O-methyltransferase isoform MB-COMT
NM_007310; catechol-O-methyltransferase isoform S-COMT


Z29331_at 


Hs.372758 


NM_003344; ubiquitin-conjugating enzyme E2H isoform 1
NM_182697; ubiquitin-conjugating enzyme E2H isoform 2


Z48605_at 


Hs.421825 


NM_006903; inorganic pyrophosphatase 2 isoform 2 NM_176865;
NM_176866; inorganic pyrophosphatase 2 isoform 3 NM_176867;
inorganic pyrophosphatase 2 isoform 4 NM_176869; inorganic
pyrophosphatase 2 isoform 1


Z74615_at 


Hs.172928 


NM_000088; alpha 1 type I collagen preproprotein 


Table 12. 40 genes for classifier 


Chip acc. # 


UniGene Build 162 


description 


D83920_at 


Hs.440898 


NM_002003; ficolin 1 precursor 


D89377_at 


Hs.89404 


NM_002449; msh homeo box homolog 2


J02871_s_at 


Hs.436317 


NM_000779; cytochrome P450, family 4, subfamily B, polypeptide 
1 


J05032_at 


Hs.32393 


NM_001349; aspartyl-tRNA synthetase 


J05070_at 


Hs.151738 


NM_004994; matrix metalloproteinase 9 preproprotein


M16591_s_at 


Hs.89555 


NM_002110; hemopoietic cell kinase isoform p61HCK


M23178_s_at 


Hs.73817 


NM_002983; chemokine (C-C motif) ligand 3 


M32011_at 


Hs.949 


NM_000433; neutrophil cytosolic factor 2 


M33195_at 


Hs.433300 


NM_004106; Fc fragment of IgE, high affinity I, receptor for, 
gamma polypeptide precursor 


M57731_s_at 


Hs.75765 


NM_002089; chemokine (C-X-C motif) ligand 2 


M68840_at


Hs.183109


NM_000240; monoamine oxidase A


M69203_s_at 


Hs.75703 


NM_002984; chemokine (C-C motif) ligand 4 precursor 


S77393_at 


Hs.145754 


NM_016531; Kruppel-like factor 3 (basic) 


U01833_at 


Hs.81469 


NM_002484; nucleotide binding protein 1 (MinD homolog, E. coli) 


U07231_at 


Hs.309763 


NM_002092; G-rich RNA sequence binding factor 1 


U09937_rna1_s_at






U20158_at 


Hs.2488 


NM_005565; lymphocyte cytosolic protein 2 


U41315_rna1_s_at






U47414_at 


Hs.13291 


NM_004354; cyclin G2 


U49352_at 


Hs.414754 


NM_001359; 2,4-dienoyl CoA reductase 1 precursor 


U50708_at 


Hs.1265 


NM_000056; branched chain keto acid dehydrogenase E1, beta 
polypeptide precursor NM_183050; branched chain keto acid
dehydrogenase E1, beta polypeptide precursor 


U65093_at 


Hs.82071 


NM_006079; Cbp/p300-interacting transactivator, with Glu/Asp- 
rich carboxy-terminal domain, 2 


U68385_at 


Hs.380923 




U77970_at 


Hs.321164 


NM_002518; neuronal PAS domain protein 2 NM_032235;


U90549_at 


Hs.236774 


NM_006353; high mobility group nucleosomal binding domain 4


X13334_at 


Hs.75627 


NM_000591; CD14 antigen precursor


X15880_at


Hs.415997


NM_001848; collagen, type VI, alpha 1 precursor


X15882_at 


Hs.420269 


NM_001849; alpha 2 type VI collagen isoform 2C2 precursor 
NM_058174; alpha 2 type VI collagen isoform 2C2a precursor 
NM_058175; alpha 2 type VI collagen isoform 2C2a precursor 


X51408_at


Hs.380138


NM_001822; chimerin (chimaerin) 1


X53800_s_at 


Hs.89690 


NM_002090; chemokine (C-X-C motif) ligand 3


X54489_rna1_at 






X57579_s_at 






X64072_s_at 


Hs.375957 


NM_000211; integrin beta chain, beta 2 precursor


X67491_f_at 


Hs.355697 


NM_005271; glutamate dehydrogenase 1 


X68194_at 


Hs.80919 


NM_006754; synaptophysin-like protein isoform a NM_182715;
synaptophysin-like protein isoform b


X73882_at 


Hs.254605 


NM_003980; microtubule-associated protein 7


X78520_at 


Hs.372528 


NM_001829; chloride channel 3 


Z29331_at 


Hs.372758 


NM_003344; ubiquitin-conjugating enzyme E2H isoform 1 
NM_182697; ubiquitin-conjugating enzyme E2H isoform 2


Z48605_at 


Hs.421825 


NM_006903; inorganic pyrophosphatase 2 isoform 2 NM_176865;
NM_176866; inorganic pyrophosphatase 2 isoform 3 NM_176867;
inorganic pyrophosphatase 2 isoform 4 NM_176869; inorganic
pyrophosphatase 2 isoform 1 


Z74615_at 


Hs.172928


NM_000088; alpha 1 type I collagen preproprotein 


Table 13. 20 genes for classifier


Chip acc. #


UniGene Build 162 


description 


D89377_at 


Hs.89404 


NM_002449; msh homeo box homolog 2 


J05032_at 


Hs.32393 


NM_001349; aspartyl-tRNA synthetase 


M23178_s_at 


Hs.73817 


NM_002983; chemokine (C-C motif) ligand 3


M32011_at 


Hs.949 


NM_000433; neutrophil cytosolic factor 2


M69203_s_at 


Hs.75703 


NM_002984; chemokine (C-C motif) ligand 4 precursor 


S77393_at 


Hs.145754


NM_016531; Kruppel-like factor 3 (basic)


U07231_at 


Hs.309763 


NM_002092; G-rich RNA sequence binding factor 1 


U41315_rna1_s_at 






U47414_at 


Hs.13291 


NM_004354; cyclin G2


U49352_at


Hs.414754


NM_001359; 2,4-dienoyl CoA reductase 1 precursor 


U50708_at 


Hs.1265


NM_000056; branched chain keto acid dehydrogenase E1, beta
polypeptide precursor NM_183050; branched chain keto acid
dehydrogenase E1, beta polypeptide precursor 


U77970_at 


Hs.321164 


NM_002518; neuronal PAS domain protein 2 NM_032235;


X13334_at 


Hs.75627 


NM_000591; CD14 antigen precursor 


X57579_s_at 






X64072_s_at 


Hs.375957 


NM_000211; integrin beta chain, beta 2 precursor 


X68194_at 


Hs.80919 


NM_006754; synaptophysin-like protein isoform a NM_182715;
synaptophysin-like protein isoform b 


X73882_at 


Hs.254605 


NM_003980; microtubule-associated protein 7


X78520_at 


Hs.372528 


NM_001829; chloride channel 3 


Z48605_at


Hs.421825 


NM_006903; inorganic pyrophosphatase 2 isoform 2 NM_176865;
NM_176866; inorganic pyrophosphatase 2 isoform 3 NM_176867;
inorganic pyrophosphatase 2 isoform 4 NM_176869; inorganic
pyrophosphatase 2 isoform 1


Z74615_at


Hs.172928


NM_000088; alpha 1 type I collagen preproprotein 


Table 14. 10 genes for classifier 


Chip acc. # 


UniGene Build 162 


description 


D89377_at 


Hs.89404 


NM_002449; msh homeo box homolog 2 


S77393_at 


Hs.145754 


NM_016531; Kruppel-like factor 3 (basic)


U41315_rna1_s_at






U47414_at 


Hs.13291 


NM_004354; cyclin G2 


U77970_at 


Hs.321164 


NM_002518; neuronal PAS domain protein 2 NM_032235; 


X68194_at 


Hs.80919 


NM_006754; synaptophysin-like protein isoform a NM_182715;
synaptophysin-like protein isoform b 


X73882_at 


Hs.254605 


NM_003980; microtubule-associated protein 7 


X78520_at 


Hs.372528 


NM_001829; chloride channel 3 


Z48605_at


Hs.421825


NM_006903; inorganic pyrophosphatase 2 isoform 2 NM_176865;
NM_176866; inorganic pyrophosphatase 2 isoform 3 NM_176867;
inorganic pyrophosphatase 2 isoform 4 NM_176869; inorganic
pyrophosphatase 2 isoform 1 


Z74615_at 


Hs.172928


NM_000088; alpha 1 type I collagen preproprotein 


Table 15. 32 genes for classifier 


Chip acc. #


UniGene Build 162 


description 


D83920_at


Hs.440898 


NM_002003; ficolin 1 precursor 


HG67-HT67_f_at 






HG907-HT907_at 






J05032_at 


Hs.32393 


NM_001349; aspartyl-tRNA synthetase


K01396_at 


Hs.297681 


NM_000295; serine (or cysteine) proteinase inhibitor, clade A 
(alpha-1 antiproteinase, antitrypsin), member 1 


M16591_s_at 


Hs.89555 


NM_002110; hemopoietic cell kinase isoform p61HCK


M32011_at 


Hs.949 


NM_000433; neutrophil cytosolic factor 2 


M33195_at 


Hs.433300 


NM_004106; Fc fragment of IgE, high affinity I, receptor for, 
gamma polypeptide precursor 


M37033_at 


Hs.443057 


NM_000560; CD53 antigen 


M57731_s_at 


Hs.75765 


NM_002089; chemokine (C-X-C motif) ligand 2 


M63262_at 






S77393_at 


Hs.145754


NM_016531; Kruppel-like factor 3 (basic)


U01833_at 


Hs.81469 


NM_002484; nucleotide binding protein 1 (MinD homolog, E. coli) 


U07231_at 


Hs.309763


NM_002092; G-rich RNA sequence binding factor 1


U41315_rna1_s_at






U47414_at 


Hs.13291 


NM_004354; cyclin G2 


U50708_at 


Hs.1265 


NM_000056; branched chain keto acid dehydrogenase E1, beta
polypeptide precursor NM_183050; branched chain keto acid
dehydrogenase E1, beta polypeptide precursor 


U52101_at 


Hs.9999 


NM_001425; epithelial membrane protein 3


U74324_at


Hs.90875 


NM_002871; RAB-interacting factor 


U77970_at 


Hs.321164 


NM_002518; neuronal PAS domain protein 2 NM_032235; 


U90549_at 


Hs.236774 


NM_006353; high mobility group nucleosomal binding domain 4


X13334_at 


Hs.75627 


NM_000591; CD14 antigen precursor


X54489_rna1_at 






X57579_s_at 






X64072_s_at 


Hs.375957 


NM_000211; integrin beta chain, beta 2 precursor 


X68194_at 


Hs.80919 


NM_006754; synaptophysin-like protein isoform a NM_182715; 
synaptophysin-like protein isoform b 


X73882_at 


Hs.254605 


NM_003980; microtubule-associated protein 7 


X78520_at 


Hs.372528 


NM_001829; chloride channel 3 


X95632_s_at 


Hs.387906 


NM_005759; abl-interactor 2 


Z29331_at 


Hs.372758 


NM_003344; ubiquitin-conjugating enzyme E2H isoform 1 
NM_182697; ubiquitin-conjugating enzyme E2H isoform 2


Z48605_at 


Hs.421825 


NM_006903; inorganic pyrophosphatase 2 isoform 2 NM_176865;
NM_176866; inorganic pyrophosphatase 2 isoform 3 NM_176867;
inorganic pyrophosphatase 2 isoform 4 NM_176869; inorganic
pyrophosphatase 2 isoform 1 


Z74615_at 


Hs.172928 


NM_000088; alpha 1 type I collagen preproprotein 



Recurrence predictor 

We furthermore tested an outcome predictor able to identify the likely presence or absence 
of recurrence in patients with superficial Ta tumours (see Table 16). 

Table 16. Patient disease course information - recurrence vs. no recurrence

From the hierarchical cluster analysis of the tumour samples we found that the tumours with
a high recurrence frequency were separated from the tumours with low recurrence
frequency. To study this further we profiled two groups of Ta tumours: 15 tumours with low
recurrence frequency and 16 tumours with high recurrence frequency. To avoid influence
from other tumour characteristics we only used tumours that showed the same growth
pattern and tumours that showed no sign of concomitant carcinoma in situ. Furthermore, the
tumours were all primary tumours. The tumours used for identifying genes differentially
expressed in recurrent and non-recurrent tumours are listed in Table 16 below.

Table 16. Disease course information of all patients involved.

Group | Sample | Tumour (date) | Growth pattern | Carcinoma in situ | Follow-up / time to recurrence
A | 968-1 | Ta gr2 | Papillary | no | 27 months
A | 928-1 | Ta gr2 | Papillary | no | 38 months
A | 934-1 | Ta gr2 (220798) | Papillary | no | -
A | 709-1 | Ta gr2 (210798) | Papillary | no | -
A | 930-1 | Ta gr2 (300698) | Papillary | no | -
A | 524-1 | Ta gr2 (201095) | Papillary | no | -
A | 455-1 | Ta gr2 (060695) | Papillary | no | -
A | 370-1 | Ta gr2 (100195) | Papillary | no | -
A | 810-1 | Ta gr2 (031097) | Papillary | no | -
A | 1146-1 | Ta gr2 (231199) | Papillary | no | -
A | 1161-1 | Ta gr2 (101299) | Mixed | no | -
A | 1006-1 | Ta gr2 (231198) | Papillary | no | -
A | 942-1 | Ta gr2 | Papillary | no | 24 months
A | 1060-1 | Ta gr2 | Papillary | no | 36 months
A | 1255-1 | Ta gr2 | Papillary | no | 24 months
B | 441-1 | Ta gr2 | Papillary | no | 6 months
B | 780-1 | Ta gr2 | Papillary | no | 2 months
B | 815-2 | Ta gr2 | Papillary | no | 6 months
B | 829-1 | Ta gr2 | Papillary | no | 4 months
B | 861-1 | Ta gr2 | Papillary | no | 4 months
B | 925-1 | Ta gr2 | Papillary | no | 5 months
B | 1008-1 | Ta gr2 | Papillary | no | 5 months
B | 1086-1 | Ta gr2 | Papillary | no | 6 months
B | 1105-1 | Ta gr2 | Papillary | no | 8 months
B | 1145-1 | Ta gr2 | Papillary | no | 4 months
B | 1327-1 | Ta gr2 | Papillary | no | 5 months
B | 1352-1 | Ta gr2 | Papillary | no | 6 months
B | 1379-1 | Ta gr2 | Papillary | no | 5 months
B | 533-1 | Ta gr2 | Papillary | no | 4 months
B | 679-1 | Ta gr2 | Papillary | no | 4 months
B | 692-1 | Ta gr2 | Papillary | no | 5 months

Group A: Primary tumours from patients with no recurrence of the disease for 2 years. 
Group B: Primary tumours from patients with recurrence of the disease within 8 months. 



Supervised learning prediction of recurrence

In this part of the work we identified genes differentially expressed between non-recurring
and recurring tumours. Cross-validation and prediction were performed as previously
described, except that genes were selected based on the value of the Wilcoxon statistic for
the difference between the two groups.
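
The gene selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure as actually implemented: the function names, the dictionary layout of the expression data, the 0/1 group labels and the two-sided scoring (distance of the rank sum W from its expected value under no difference) are all hypothetical choices made for the example.

```python
def average_ranks(values):
    """Return 1-based ranks of the values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_w(group_a, group_b):
    """Wilcoxon rank-sum statistic: sum of the pooled ranks of group_a."""
    ranks = average_ranks(list(group_a) + list(group_b))
    return sum(ranks[: len(group_a)])


def gene_score(group_a, group_b):
    """Distance of W from its expectation under no difference (assumed scoring)."""
    n_a, n_b = len(group_a), len(group_b)
    expected = n_a * (n_a + n_b + 1) / 2
    return abs(wilcoxon_w(group_a, group_b) - expected)


def select_genes(expression, labels, n_genes, left_out=None):
    """Rank genes on all samples except `left_out`; return the top n_genes names."""
    scores = {}
    for gene, values in expression.items():
        kept = [(v, lab) for i, (v, lab) in enumerate(zip(values, labels))
                if i != left_out]
        group_a = [v for v, lab in kept if lab == 0]  # 0 = no recurrence
        group_b = [v for v, lab in kept if lab == 1]  # 1 = recurrence
        scores[gene] = gene_score(group_a, group_b)
    return sorted(scores, key=lambda g: (-scores[g], g))[:n_genes]


if __name__ == "__main__":
    # Toy data: three samples per group, two hypothetical genes.
    expression = {"g1": [1, 2, 3, 10, 11, 12], "g2": [5, 1, 6, 2, 7, 3]}
    labels = [0, 0, 0, 1, 1, 1]
    for left_out in range(len(labels)):
        print(left_out, select_genes(expression, labels, 1, left_out))
```

In a leave-one-out loop the selection is repeated with each sample held out in turn, so that the held-out sample never influences which genes are chosen for its own prediction.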


Prediction performance

The prediction performance was tested using from 1 to 200 genes in the cross-validation loops.
Figure 11 shows that the lowest error rate (8 errors) is obtained in, e.g., the cross-validation
model using 39 genes. Based on this we selected this cross-validation model as our
final predictor. The results of the predictions from the 39 gene cross-validation loops are
listed in Table 17. The predictor misclassified four of the samples in each group, and in one
of the predictions the difference in the distances between the two group means is below the
5% difference limit as described above.
The probability of misclassifying 8 or fewer arrays by a random classification is 0.0053.
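
The quoted probability is consistent with a simple binomial tail. The sketch below assumes that a random classification assigns each of the 31 arrays (15 in group A plus 16 in group B) to the wrong group independently with probability 0.5; the function name and this model of random classification are assumptions made for illustration.

```python
from math import comb


def random_error_tail(n_arrays, max_errors, p_error=0.5):
    """P(at most max_errors misclassifications) under independent random guessing."""
    return sum(
        comb(n_arrays, k) * p_error**k * (1 - p_error) ** (n_arrays - k)
        for k in range(max_errors + 1)
    )


if __name__ == "__main__":
    p = random_error_tail(n_arrays=31, max_errors=8)
    print(f"{p:.4f}")  # prints 0.0053
```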

Table 17. Recurrence prediction results of 39 gene cross-validation loops.
Group A: Primary tumours from patients with no recurrence of the disease for 2 years. Group
B: Primary tumours from patients with recurrence of the disease within 8 months. Prediction:
0 = no recurrence, 1 = recurrence.

Group | Sample | Tumour (date) | Prediction | Error | Difference in distances
A | 968-1 | Ta gr2 | 0 | | 0.19
A | 928-1 | Ta gr2 | 0 | | 0.49
A | 934-1 | Ta gr2 (220798) | 0 | | 1.73
A | 709-1 | Ta gr2 (210798) | 0 | | 0.45
A | 930-1 | Ta gr2 (300698) | 0 | | 0.82
A | 524-1 | Ta gr2 (201095) | 0 | | 0.14
A | 455-1 | Ta gr2 (060695) | 1 | * | 0.68
A | 370-1 | Ta gr2 (100195) | 0 | | 0.32
A | 810-1 | Ta gr2 (031097) | 0 | | 0.45
A | 1146-1 | Ta gr2 (231199) | 0 | | 0.98
A | 1161-1 | Ta gr2 (101299) | 0 | | 0.03
A | 1006-1 | Ta gr2 (231198) | 1 | * | 1.57
A | 942-1 | Ta gr2 | 0 | | 0.31
A | 1060-1 | Ta gr2 | 1 | * | 0.81
A | 1255-1 | Ta gr2 | 1 | * | 0.71
B | 441-1 | Ta gr2 | 1 | | 1.03
B | 780-1 | Ta gr2 | 1 | | 0.37
B | 815-2 | Ta gr2 | 1 | | 0.35
B | 829-1 | Ta gr2 | 1 | | 0.75
B | 861-1 | Ta gr2 | 0 | * | 2.55
B | 925-1 | Ta gr2 | 1 | | 0.78
B | 1008-1 | Ta gr2 | 0 | * | 0.12
B | 1086-1 | Ta gr2 | 0 | * | 0.51
B | 1105-1 | Ta gr2 | 1 | | 0.37
B | 1145-1 | Ta gr2 | 1 | | 0.44
B | 1327-1 | Ta gr2 | 1 | | 1.96
B | 1352-1 | Ta gr2 | 0 | * | 0.97
B | 1379-1 | Ta gr2 | 1 | | 0.67
B | 533-1 | Ta gr2 | 1 | | 0.31
B | 679-1 | Ta gr2 | 1 | | 0.82
B | 692-1 | Ta gr2 | 1 | | 0.45


The optimal number of genes in cross-validation loops was found to be 39 (75% of the
samples were correctly classified, p<0.006) and from this we selected those 26 genes that were
used in at least 75% of the cross-validation loops to constitute our final recurrence predictor.
Consequently, this set of genes is to be used for predicting recurrence in independent
samples. We tested the strength of the predictive genes by permutation analysis, see Table 18.
We selected the genes used in at least 29 of the 31 cross-validation loops to constitute our
final recurrence prediction model. The expression pattern of those 26 genes is shown in fig.
12.
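
The final selection step, keeping only genes that recur across the cross-validation loops, can be illustrated with a short counting sketch. The function name and the list-of-lists input format are hypothetical; only the thresholding idea (for example, at least 29 of 31 loops) comes from the text.

```python
from collections import Counter


def stable_genes(selected_per_loop, min_loops):
    """Return the genes selected in at least `min_loops` cross-validation loops."""
    # set(loop) guards against a gene being counted twice within one loop.
    counts = Counter(gene for loop in selected_per_loop for gene in set(loop))
    return sorted(gene for gene, n in counts.items() if n >= min_loops)


if __name__ == "__main__":
    # Toy example with three loops and hypothetical gene names.
    loops = [["a", "b"], ["a", "c"], ["a", "b"]]
    print(stable_genes(loops, min_loops=2))  # prints ['a', 'b']
```

With the 31 leave-one-out loops described here, the call would use `min_loops=29`.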



Table 18. The 26 genes that we find optimal for recurrence prediction.

Chip acc. # | UniGene Build 162 | description | Number | Test
AF006041_at | Hs.336916 | NM_001350; death-associated protein 6 | 31 | 0.054 (161-7)
D21337_at | Hs.408 | NM_001847; type IV alpha 6 collagen isoform A precursor NM_033641; type IV alpha 6 collagen isoform B precursor | 31 | 0.058 (160-6)
D49387_at | Hs.294584 | NM_012212; NADP-dependent leukotriene B4 12-hydroxydehydrogenase | 31 | 0.118 (313-8)
D64154_at | Hs.90107 | NM_007002; adhesion regulating molecule 1 precursor NM_175573; adhesion regulating molecule 1 precursor | 31 | 0.078 (165-9)
D83780_at | Hs.437991 | NM_014846; KIAA0196 gene product | 31 | 0.094 (159-4)
D87258_at | Hs.75111 | NM_002775; protease, serine, 11 | 30 | 0.112 (168-11)
D87437_at | Hs.43660 | NM_014837; chromosome 1 open reading frame 16 | 31 | 0.058 (160-6)
HG1879-HT1919_at | | | 31 | 0.122 (314-7)
HG3076-HT3238_s_at | | | 31 | 0.080 (309-17)
HG511-HT511_at | | | 31 | 0.348 (319-2)
L34155_at | Hs.83450 | NM_000227; laminin alpha 3 subunit precursor | 31 | 0.122 (314-7)
L38928_at | Hs.118131 | NM_006441; 5,10-methenyltetrahydrofolate synthetase (5-formyltetrahydrofolate cyclo-ligase) | 29 | 0.348 (319-2)
L49169_at | Hs.75678 | NM_006732; FBJ murine osteosarcoma viral oncogene homolog B | 31 | 0.108 (155-2)
M16938_s_at | Hs.820 | NM_004503; homeo box C6 isoform 1 NM_153693; homeo box C6 isoform 2 | 29 | 0.09 (170-16)
M63175_at | Hs.295137 | NM_001144; autocrine motility factor receptor isoform a NM_138958; autocrine motility factor receptor isoform b | 29 | 0.098 (308-18)
M64572_at | Hs.405666 | NM_002829; protein tyrosine phosphatase, non-receptor type 3 | 31 | 0.064 (305-31)
M98528_at | Hs.79404 | NM_014392; DNA segment on chromosome 4 (unique) 234 expressed sequence | 31 | 0.122 (314-7)
U21858_at | Hs.60679 | NM_003187; TBP-associated factor 9 NM_016283; adrenal gland protein AD-004 | 31 | 0.122 (314-7)
U45973_at | Hs.178347 | NM_016532; skeletal muscle and kidney enriched inositol phosphatase isoform 1 NM_130766; skeletal muscle and kidney enriched inositol phosphatase isoform 2 | 31 | 0.094 (310-14)
U58516_at | Hs.3745 | NM_005928; milk fat globule-EGF factor 8 protein | 29 | 0.100 (175-28)
U62015_at | Hs.8867 | NM_001554; cysteine-rich, angiogenic inducer, 61 | 31 | 0.106 (169-13)
U66702_at | Hs.74624 | NM_002847; protein tyrosine phosphatase, receptor type, N polypeptide 2 isoform 1 precursor NM_130842; protein tyrosine phosphatase, receptor type, N polypeptide 2 [...] phosphatase, receptor type, N polypeptide 2 isoform 3 precursor | 31 | 0.146 (149-1)
U70439_s_at | Hs.84264 | NM_006401; acidic (leucine-rich) nuclear phosphoprotein 32 family, member B | 30 | 0.08 (309-17)
U94855_at | Hs.381255 | NM_003754; eukaryotic translation initiation factor 3, subunit 5 epsilon, 47kDa | 30 | 0.092 (311-12)
X63469_at | Hs.77100 | NM_002095; general transcription factor IIE, polypeptide 2, beta 34kDa | 31 | 0.092 (311-12)
Z23064_at | Hs.380118 | NM_002139; RNA binding motif protein, X chromosome | 30 | 0.066 (307-24)



Number: Number of times the gene has been used in a cross-validation loop. Test: The
numbers in parentheses are the value W of the Wilcoxon test statistic for no difference
between the two groups together with the number N of genes for which the Wilcoxon test
statistic is greater than or equal to the value W. The test value is obtained from 500
permutations of the arrays. In each permutation we form new pseudogroups such that both
pseudogroups have the same proportion of arrays from the two original groups. For each
permutation we count the number of genes for which the Wilcoxon test statistic based on the
pseudogroups is greater than or equal to W, and the test value is the proportion of the
permutations for which this number is greater than or equal to N. Thus the test value
measures the significance of the observed value W. Consequently, for most of our selected
genes we find at least as good predictive genes in only about 10% of the formed
pseudogroups.

We present data on expression patterns that classify benign and muscle-invasive bladder
carcinomas. Furthermore, we can identify subgroups of bladder cancer such as Ta tumours
with surrounding CIS, Ta tumours with a high probability of progression as well as
recurrence, and T2 tumours with squamous metaplasia. As a novel finding, the matrix
remodelling gene cluster was specifically expressed in the tumours having the worst
prognosis, namely the T2 tumours and tumours surrounded by CIS. For some of these genes new
small molecule inhibitors already exist (Kerr et al. 2002), and thus they form drug targets. At
present it is not possible clinically to distinguish patients who will experience recurrence
from those who will not, although doing so would greatly benefit both the patients and the
health system by reducing the number of unnecessary control examinations in bladder tumour
patients. To determine the optimal gene-set for separating non-recurrent and recurrent
tumours, we again applied a cross-validation scheme using from 1 to 200 genes. We determined
the optimal number of genes in cross-validation loops to be 39 (75% of the samples were
correctly classified, p<0.01, Figure 11) and from this we selected those 26 genes (Figure 12)
that were used in at least 75% of the cross-validation loops to constitute our final
recurrence predictor. Consequently, this set of genes is to be used for predicting recurrence
in independent samples. We tested the strength of the predictive genes by performing 500
permutations of the arrays. This revealed that for most of our predictive genes we would
obtain at least as good predictors as in the real groups in only a small number of the new
pseudo-groups.
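
The selection of genes used in at least 75% of the cross-validation loops can be illustrated with a short sketch. It assumes that each loop reports the list of genes it selected; the function name and the gene labels in the usage example are hypothetical.

```python
from collections import Counter


def stable_genes(selected_per_loop, min_fraction=0.75):
    """Return genes selected in at least min_fraction of the cross-validation loops."""
    # Count each gene at most once per loop.
    counts = Counter(g for genes in selected_per_loop for g in set(genes))
    n_loops = len(selected_per_loop)
    return sorted(g for g, c in counts.items() if c >= min_fraction * n_loops)


# Hypothetical usage: four loops, three candidate genes.
loops = [["a", "b"], ["a", "c"], ["a", "b"], ["a", "b"]]
print(stable_genes(loops))
```

In this toy input, gene "a" is used in all four loops and gene "b" in three of four, so both meet the 75% threshold, while gene "c" does not.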



Biological material

66 bladder tumour biopsies were sampled from patients following removal of the necessary
amount of tissue for routine pathology examination. The tumours were frozen immediately
after surgery and stored at -80°C in a guanidinium thiocyanate solution. All tumours were
graded according to Bergkvist et al. 1965 and re-evaluated by a single pathologist. As
normal urothelial reference samples we used a pool of biopsies (from 37 patients) as well as
three single bladder biopsies from patients with prostatic hyperplasia or urinary incontinence.
Informed consent was obtained in all cases and protocols were approved by the local
scientific ethical committee.



RNA purification and cRNA preparation

Total RNA was isolated from crude tumour biopsies using a Polytron homogeniser and the
RNAzol B RNA isolation method (WAK-Chemie Medical GmbH). 10 µg total RNA was used
as starting material for the cDNA preparation. The first and second strand cDNA synthesis was
performed using the Superscript Choice System (Life Technologies) according to the
manufacturer's instructions, except that an oligo-dT primer containing a T7 RNA polymerase
promoter site was used. Labelled cRNA was prepared using the BioArray High Yield RNA
Transcript Labelling Kit (Enzo). Biotin-labelled CTP and UTP (Enzo) were used in the reaction
together with unlabelled NTPs. Following the IVT reaction, the unincorporated nucleotides
were removed using RNeasy columns (Qiagen).


Array hybridisation and scanning 

15 µg of cRNA was fragmented at 94°C for 35 min in a fragmentation buffer containing 40
mM Tris-acetate pH 8.1, 100 mM KOAc, 30 mM MgOAc. Prior to hybridisation, the
fragmented cRNA in a 6xSSPE-T hybridisation buffer (1 M NaCl, 10 mM Tris pH 7.6, 0.005%
Triton) was heated to 95°C for 5 min and subsequently to 45°C for 5 min before loading onto
the Affymetrix probe array cartridge (HuGeneFL). The probe array was then incubated for 16
h at 45°C at constant rotation (60 rpm). The washing and staining procedure was performed
in the Affymetrix Fluidics Station. The probe array was exposed to 10 washes in 6xSSPE-T
at 25°C followed by 4 washes in 0.5xSSPE-T at 50°C. The biotinylated cRNA was stained
with a streptavidin-phycoerythrin conjugate, final concentration 2 µg/µl (Molecular Probes,
Eugene, OR) in 6xSSPE-T for 30 min at 25°C followed by 10 washes in 6xSSPE-T at 25°C.
The probe arrays were scanned at 560 nm using a confocal laser-scanning microscope
(Hewlett Packard GeneArray Scanner G2500A). The readings from the quantitative scanning
were analysed by the Affymetrix Gene Expression Analysis Software. An antibody
amplification step followed using normal goat IgG as blocking reagent, final concentration 0.1 mg/ml
(Sigma) and biotinylated anti-streptavidin antibody (goat), final concentration 3 µg/ml (Vector
Laboratories). This was followed by a staining step with a streptavidin-phycoerythrin
conjugate, final concentration 2 µg/µl (Molecular Probes, Eugene, OR) in 6xSSPE-T for 30 min at
25°C and 10 washes in 6xSSPE-T at 25°C. The arrays were then subjected to a second
scan under similar conditions as described above.

Class discovery using hierarchical clustering 

All microarray results were scaled to a global intensity of 150 units using the Affymetrix
GeneChip software. Other ways of array normalisation exist (Li and Hung 2001); however,
using the dCHIP approach did not change the expression profiles of the obtained classifier
genes in this study (results not shown). For hierarchical cluster analysis and molecular
classification procedures we used expression level ratios between tumours and the normal
urothelium reference pool, calculated using the comparison analysis implemented in the
Affymetrix GeneChip software. In order to avoid expression ratios based on saturated
gene-probes, we used the antibody-amplified expression data for genes with a mean Average
Difference value across all samples below 1000 and the non-amplified expression data for
genes with a mean Average Difference value across all samples equal to or above 1000.
Consequently, gene expression levels across all samples were either from the amplified or the
non-amplified expression data. We applied different filtering criteria to the expression data in
order to avoid including non-varying and very low expressed genes in the data analysis.
Firstly, we selected only genes that showed significant changes in expression levels
compared to the normal reference pool in at least three samples. Secondly, only genes with at
least three "Present" calls across all samples were selected. Thirdly, we eliminated genes
varying less than 2 standard deviations across all samples. The final gene-set contained
1767 genes following filtering. Two-way hierarchical agglomerative cluster analysis was
performed using the Cluster software (reference 25). We used average linkage clustering with a
modified Pearson correlation as similarity metric. Genes and arrays were median centred and
normalised to the magnitude of 1 prior to cluster analysis. The TreeView software was used for
visualisation of the cluster analysis results (Eisen et al. 1998). Multidimensional scaling was
performed on median centred and normalised data using an implementation in the SPSS
statistical software package.
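
Two of the data preparation steps above, the choice between amplified and non-amplified data and the centring and normalisation before clustering, can be sketched as follows. This is a minimal illustration under stated assumptions: the mean Average Difference is taken from the non-amplified scan, "magnitude of 1" is read as a Euclidean length of 1, and the function names are hypothetical.

```python
import math
from statistics import mean, median


def pick_source(amplified, plain, threshold=1000.0):
    """Per gene: use the amplified scan if the mean non-amplified signal is below threshold."""
    # Assumption: the mean Average Difference is computed on the non-amplified scan.
    return [amp if mean(raw) < threshold else raw for amp, raw in zip(amplified, plain)]


def centre_and_normalise(row):
    """Median-centre a vector and scale it to unit Euclidean length."""
    med = median(row)
    centred = [v - med for v in row]
    norm = math.sqrt(sum(v * v for v in centred))
    # A constant vector has zero length after centring and is returned unscaled.
    return [v / norm for v in centred] if norm > 0 else centred
```

Choosing one data source per gene keeps all samples of a gene on the same scale, so ratios for that gene are never mixed between the two scans.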

Tumour stage classifier 

We based the classifier on the log-transformed expression level ratios. For these trans- 
formed values we used a normal distribution with the mean dependent on the gene and the 
group (Ta, T1, and T2, respectively) and the variance dependent on the gene only. For each 
gene we calculated the variation within the groups (W) and the three variations between two 
groups (B(Ta/T1), B(Ta/T2), B(T1/T2)) and used the three ratios B/W to select genes. We 
selected those genes having a high value of B(Ta/T1)/W, those genes having a high value
of B(Ta/T2)/W, and those genes with a high value of B(T1/T2)/W. To classify a sample, we
calculated the sum over the genes of the squared distance from the sample value to the 
group mean, standardised by the variance. Thus, we got a distance to each of the three 
groups and the sample was classified as belonging to the group in which the distance was 
smallest. When calculating these distances the group means and the variances were esti- 
mated from all the samples in the training set excluding the sample being classified. 
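
The stage classifier described above can be sketched in a few lines. This is a sketch under assumptions and not the original implementation: the between-group variation B for a pair of groups is taken here as the squared difference of the two group means, W is the pooled within-group variance, and all function names are hypothetical. Expression data are assumed to be log-transformed ratios stored as one list per gene.

```python
from statistics import mean


def group_means(values, labels):
    """Mean of one gene's values within each group."""
    return {g: mean(v for v, lab in zip(values, labels) if lab == g)
            for g in sorted(set(labels))}


def within_variance(values, labels):
    """Pooled within-group variance W (gene-specific, shared by all groups)."""
    means = group_means(values, labels)
    ss = sum((v - means[lab]) ** 2 for v, lab in zip(values, labels))
    return ss / (len(values) - len(means))


def select_genes(expr, labels, pair, top):
    """Rank genes by B/W for one pair of groups and return the indices of the best ones."""
    a, b = pair
    scores = []
    for g, row in enumerate(expr):
        m, w = group_means(row, labels), within_variance(row, labels)
        scores.append(((m[a] - m[b]) ** 2 / w if w > 0 else 0.0, g))
    return [g for _, g in sorted(scores, reverse=True)[:top]]


def classify(expr, labels, genes, sample):
    """Assign the sample to the group with the smallest variance-standardised distance."""
    dist = {}
    for g in genes:
        w = within_variance(expr[g], labels)
        if w <= 0:
            continue  # a gene without within-group variation cannot be standardised
        for grp, mu in group_means(expr[g], labels).items():
            dist[grp] = dist.get(grp, 0.0) + (sample[g] - mu) ** 2 / w
    return min(dist, key=dist.get)


def leave_one_out(expr, labels, genes):
    """Classify each sample with means and variances estimated without that sample."""
    preds = []
    for i in range(len(labels)):
        train = [row[:i] + row[i + 1:] for row in expr]
        train_labels = labels[:i] + labels[i + 1:]
        preds.append(classify(train, train_labels, genes, [row[i] for row in expr]))
    return preds
```

Excluding the sample being classified from the estimates avoids the optimistic bias that would arise if a sample contributed to its own group mean.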

Recurrence prediction using a supervised learning method 

Average Difference values were generated using the Affymetrix GeneChip software and all
values below 20 were set to 20 to avoid very low and negative numbers. We only included
genes that had a "Present" call in at least 7 samples and genes that showed intensity
variation (Max-Min>100, Max/Min>2). The values were log transformed and rescaled. We used a
supervised learning method essentially as described (Shipp et al. 2002). Genes were
selected using t-test statistics, and cross-validation and sample classification were performed
as described above.
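
The preprocessing steps for the recurrence predictor can be illustrated as follows. The sketch assumes that both filters must be satisfied by a gene, that the logarithm is base 2, and that the unspecified rescaling step is omitted; the function and variable names are hypothetical.

```python
import math


def preprocess(avg_diff, present_calls, floor=20.0, min_present=7):
    """Floor the values at 20, filter genes on Present calls and variation, then log-transform.

    avg_diff: dict mapping gene -> list of Average Difference values (one per sample).
    present_calls: dict mapping gene -> number of samples with a "Present" call.
    """
    kept = {}
    for gene, values in avg_diff.items():
        v = [max(x, floor) for x in values]  # values below 20 are set to 20
        hi, lo = max(v), min(v)
        if present_calls[gene] >= min_present and hi - lo > 100 and hi / lo > 2:
            kept[gene] = [math.log2(x) for x in v]  # base 2 is an assumption
    return kept
```

Flooring before filtering ensures that the Max/Min ratio is always defined and that the logarithm is never taken of a negative or zero value.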

Immunohistochemistry 

Tumour tissue microarrays were prepared essentially as described (Kononen et al. 1998),
with four representative 0.6 mm paraffin cores from each study case. Immunohistochemical
staining was performed using standard highly sensitive techniques after appropriate
heat-induced antigen retrieval. Primary polyclonal goat antibodies against Smad 6 (S-20) and
cyclin G2 (N-19) were from Santa Cruz Biotechnology. Antibodies to p53 (monoclonal DO-7)
and Her-2 (polyclonal anti-c-erbB-2) were from Dako A/S. Ki-67 monoclonal antibody (MIB1)
was from Novocastra Laboratories Ltd. Staining intensity was scored at four levels (Negative,
Weak, Moderate and Strong) by an experienced pathologist who considered both colour
intensity and number of stained cells, and who was unaware of array results.

EXAMPLE 3 

A molecular classifier detects carcinoma in situ expression signatures in tumors and
normal urothelium of the bladder.

Clinical samples 

Bladder tumour samples were obtained directly from surgery following removal of tissue for
routine pathological examination. The samples were immediately submerged in a guanidinium
thiocyanate solution for RNA preservation and stored at -80°C. Informed consent was
obtained in all cases, and the protocols were approved by the scientific ethical committee of
Aarhus County. Samples in the No-CIS group were selected based on the following criteria:
a) Ta tumours with no CIS in selected site biopsies in all visits; b) no previous muscle
invasive tumour. Samples in the CIS group were selected based on the following criteria: a) Ta
or T1 tumours with CIS in selected site biopsies in any visit (preferably Ta tumours with CIS in
the sampling visit); b) no previous muscle invasive tumours. Normal biopsies were obtained from
individuals with prostatic hyperplasia or urinary incontinence. CIS and "normal" biopsies
were obtained from cystectomy specimens directly following removal of the bladder. A grid
was placed in the bladder for orientation and biopsies were taken from 8 positions covering
the bladder surface. At each position, three biopsies were taken: two for pathologic
examination and one in between these for RNA extraction for microarray expression
profiling. The samples for RNA extraction were immediately transferred to the guanidinium
thiocyanate solution and stored at -80°C until use. Samples used for RNA extraction were
assumed to have CIS if CIS was detected in both adjacent biopsies. The "normal" samples
were assumed to be normal if both adjacent biopsies were normal.

cRNA preparation, array hybridisation and scanning

Purification of total RNA, preparation of cRNA from cDNA, and hybridisation and scanning
were performed as previously described (Dyrskjot et al. 2003). The labelled samples were
hybridised to Affymetrix U133A GeneChips.

Expression data analysis 

Following scanning, all data were normalised using the RMA normalisation approach in the
Bioconductor Affy package for R. Variation filters were applied to the data to eliminate
non-varying and presumably non-expressed genes. For gene-set 1 this was done by
including only genes with a minimum expression above 200 in at least 5 samples and genes with
max/min expression intensities above or equal to 3. The filtering for gene-set 2 included only
genes with a minimum expression of 200 in at least 3 samples and genes with max/min
expression intensities above or equal to 3. Average linkage hierarchical cluster analysis was
carried out using the Cluster software with a modified Pearson correlation as similarity metric
(Eisen et al. 1998). We used the TreeView software for visualisation of the cluster analysis
results (Eisen et al. 1998). Genes were log-transformed, median centred and normalised to
the magnitude of 1 before clustering. We used GeneCluster 2.0
(http://www-genome.wi.mit.edu/cancer/software/genecluster2/gc2.html) for the supervised
selection of markers and for permutation testing. The algorithms used in the software are
based on Golub et al. 1999 and Tamayo et al. 1999. Classifiers for CIS detection were built
using the same methods as described previously (Dyrskjot et al. 2003).

Gene expression profiling

We used high-density oligonucleotide microarrays for gene expression profiling of
approximately 22,000 genes in 28 superficial bladder tumour biopsies (13 tumours with
surrounding CIS and 15 without surrounding CIS) and in 13 invasive carcinomas. See Table
19 for patient disease course descriptions. Furthermore, expression profiles were obtained
from 9 normal biopsies and from 10 biopsies from cystectomy specimens (5 histologically
normal biopsies and 5 biopsies with CIS).



Table 19 Clinical data on patient disease courses and results of molecular CIS classification 
Previous
Tumour
No CIS
1146-1    Ta gr2    No    No CIS
1216-1    Ta gr2    No    No CIS
a The tumour groups involved were TCC without CIS (1), TCC with CIS (2) and invasive TCC (3).
b The numbers indicate the patient number followed by the clinic visit number.
c CIS in selected site biopsies in previous, present or subsequent visits to the clinic. ND: not determined.
d Molecular classification of the samples using 25 genes in cross-validation loops.

Hierarchical cluster analysis

Following appropriate normalisation and expression intensity calculations we selected those
genes that showed high variation across the 41 TCC samples for further analysis. The
filtering produced a gene-set consisting of 5,491 genes (gene-set 1) and two-way
hierarchical cluster analysis was performed based on this gene-set. The sample clustering
showed a separation of the three groups of samples with only a few exceptions (Fig. 14a).
Superficial TCC with surrounding CIS clustered in one main branch of the dendrogram,
while the superficial TCC without CIS and the invasive TCC clustered in two separate
sub-branches in the other main branch of the dendrogram. The only exceptions were that the
invasive TCC samples 1044-1 and 1124-1 clustered in the CIS group and two TCC with CIS
clustered in the invasive group (samples 1330-1 and 956-2). The only TCC without CIS that
clustered in the CIS group was sample 1482-1. The distinct clustering of the tumour groups
indicated a large difference in gene expression patterns.

Hierarchical clustering of the genes (Fig. 14c) identified large clusters of genes characteristic
of each tumour phenotype. Cluster 1 showed a cluster of genes down-regulated in
cystectomy biopsies, TCC with adjacent CIS and in some invasive carcinomas (Fig. 14c).
There is no obvious functional relationship between the genes in this cluster. Cluster 2
showed a tight cluster of genes related to immunology and cluster 3 contained mostly genes
expressed in muscle and connective tissue. Expression of genes in this cluster was
observed in the normal and cystectomy samples, in a fraction of the TCC with CIS and in the
invasive tumours. Cluster 4 contained genes up-regulated in the cystectomy biopsies, TCC
with adjacent CIS and in invasive carcinomas (Fig. 14c). This cluster includes genes
involved in cell cycle regulation, cell proliferation and apoptosis. However, for most of the
genes in this cluster there is no apparent functional relationship either. Comparisons of
chromosomal location of the genes in the clusters revealed no correlation between the
observed gene clusters and chromosomal position of the identified genes. A positive
correlation could have indicated chromosomal loss or gain or chromosomal inactivation by
e.g. methylation of common promoter regions.

To analyse the impact of surrounding CIS lesions further we used the 28 superficial tumours
only, and created a new gene set consisting of 5,252 varying genes (gene-set 2).
Hierarchical cluster analysis of the tumour samples (Figure 13b) based on the new gene-set
separated the samples according to the presence of CIS in the surrounding urothelium with
only 1 exception (P<0.000001, χ²-test). Sample 1482-1 clustered in the TCC with CIS
group; however, no CIS has been detected in selected site biopsies during routine
examinations of this patient. Tumour samples 1182-1 and 1093-1 did not have CIS in
selected site biopsies in the same visit as the profiled tumour but showed this in later visits.
However, the profile of these two superficial tumour samples already showed the adjacent
CIS profile.



Marker selection 

To delineate the tumours with surrounding CIS from the tumours without CIS we used t-test
statistics to select the 50 most up-regulated genes in each group (Figure 15a). Permutation
of the sample labels 500 times revealed that the 50 genes up-regulated in the CIS group are
highly significantly differentially expressed and unlikely to be found by chance, as all markers
were significant at the 5% significance level. Consequently, in 500 random datasets it was
possible to select equally good genes in fewer than 5% of the datasets. The 50 genes
up-regulated in the no-CIS group performed more poorly in the permutation tests, as these were
not significant at the 5% significance level. See Table 20 for details. The relative expression of
these 100 genes in 9 normal biopsies and 10 biopsies from cystectomies with CIS is shown in
Figure 15b. The no-CIS profile was found in all of the normal samples. However, all histologically
normal samples adjacent to the CIS lesions as well as the CIS biopsies showed the CIS
profile.
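
The marker selection and permutation levels reported in Table 20 can be illustrated with the following sketch. It is not the GeneCluster implementation. It assumes a Welch-type t statistic, and it assumes that the permutation level for the k-th best marker is the value exceeded by the k-th best permuted score in the stated fraction of label permutations. All names are hypothetical.

```python
import math
import random
from statistics import mean, variance


def t_stat(values, labels):
    """Welch-type t statistic for label 1 versus label 0 (assumed form of the t-test)."""
    a = [v for v, lab in zip(values, labels) if lab == 1]
    b = [v for v, lab in zip(values, labels) if lab == 0]
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def permutation_levels(expr, labels, top, level=0.05, n_perm=500, seed=0):
    """Return (observed score, permutation level) for each of the top-ranked markers."""
    rng = random.Random(seed)
    observed = sorted((t_stat(r, labels) for r in expr), reverse=True)[:top]
    perm_scores = [[] for _ in range(len(observed))]
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the sample labels, keeping group sizes
        best = sorted((t_stat(r, shuffled) for r in expr), reverse=True)[:top]
        for k, score in enumerate(best):
            perm_scores[k].append(score)
    thresholds = []
    for scores in perm_scores:
        scores.sort(reverse=True)
        thresholds.append(scores[min(int(level * n_perm), n_perm - 1)])
    return list(zip(observed, thresholds))
```

Under these assumptions, a marker whose observed score lies above its 5% level would be called significant at the 5% level, which is how the T-test and Perm columns of Table 20 can be compared.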



Table 20. The best 100 markers
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no_CIS 
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protein 1 
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3.98 
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NM_001 91 0; cathepsin E iso- 
form a preproprotein 
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4.03 
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NM_007193; annexin A10 
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3.15 


3.87 


3.51 
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NM_001958; eukaryotic transla- 
tion elongation factor 1 alpha 2 


214599_at 


no_CIS 


3.02 
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3.14 
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no_CIS 


2.84 


3.63 


3.20 


3.00 


Hs.76422 


NM_000300; phospholipase A2, 
group IIA (platelets, synovial 
fluid) 


203980_at    no_CIS    2.74    3.47    3.12    2.89    Hs.391561    NM_001442; fatty acid binding protein 4, adipocyte
209270_at    no_CIS    2.39    3.38    3.10    2.85    Hs.436983    NM_000228; laminin subunit beta 3 precursor
206658_at    no_CIS    2.35    3.37    3.05    2.78    Hs.284211    NM_030570; uroplakin 3B isoform a NM_182683; uroplakin 3B isoform c NM_182684; uroplakin 3B isoform b
22077[?]_at    no_CIS    2.35    3.33    2.97    2.73    Hs.149195    NM_016233; peptidylarginine deiminase type III
216971_s_at    no_CIS    2.28    3.29    2.91    2.71    Hs.79706    NM_000445; plectin 1, intermediate filament binding protein 500kDa
206191_at    no_CIS    2.25    3.24    2.86    2.68    Hs.47042    NM_001248; ectonucleoside triphosphate diphosphohydrolase 3
218484_at    no_CIS    2.18    3.20    2.81    2.62    Hs.221447    NM_020142; NADH:ubiquinone oxidoreductase MLRQ subunit homolog
221854_at    no_CIS    2.1    3.19    2.80    2.60    Hs.313068    NM_000299; plakophilin 1
203792_x_at    no_CIS    2.02    3.16    2.74    2.55    Hs.371617    NM_007144; ring finger protein 110
207862_at    no_CIS    2.01    3.16    2.72    2.52    Hs.379613    NM_006760; uroplakin 2
218960_at    no_CIS    1.93    3.14    2.65    2.47    Hs.414005    NM_019894; transmembrane protease, serine 4 isoform 1 NM_183247; transmembrane protease, serine 4 isoform 2
203009_at    no_CIS    1.93    3.12    2.62    2.45    Hs.155048    NM_005581; Lutheran blood group (Auberger b antigen included)
204508_s_at    no_CIS    [illegible]    3.10    2.60    2.42    Hs.279916    NM_017689; hypothetical protein FLJ20151
211692_s_at    no_CIS    1.87    3.06    2.58    2.39    Hs.87246    NM_014417; BCL2 binding component 3
206465_at    no_CIS    1.86    3.04    2.54    2.38    Hs.277543    NM_015162; lipidosin
206122_at    no_CIS    1.85    2.92    2.52    2.36    Hs.95582    NM_006942; SRY-box 15
206393_at    no_CIS    1.83    2.89    2.49    2.33    Hs.83760    NM_003282; troponin I, skeletal, fast
214639_s_at    no_CIS    1.79    2.87    2.49    2.30    Hs.67397    NM_005522; homeobox A1 protein isoform a NM_153620; homeobox A1 protein isoform b
214630_at    no_CIS    1.79    2.84    2.44    2.28    Hs.184927    NM_000497; cytochrome P450, subfamily XIB (steroid 11-beta-hydroxylase), polypeptide 1 precursor
204465_s_at    no_CIS    1.77    2.81    2.42    2.27    Hs.76888    NM_004692; NM_032727; internexin neuronal intermediate filament protein, alpha
204990_s_at    no_CIS    1.76    2.79    2.41    2.24    Hs.85266    NM_000213; integrin, beta 4
205453_at    no_CIS    1.75    2.77    2.39    2.22    Hs.290432    NM_002145; homeo box B2
215812_s_at    no_CIS    1.74    2.77    2.37    2.20    Hs.499113    NM_018058; cartilage acidic protein 1
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217040 x at 



203759 at 



211002_s_at 



216641_s_at 



221660 at 



220026 at 



209591 s at 



219922_s at 



201641 at 



204952 at 



no CIS 



1.74 



2.75 



no_CIS I 1/73 



no_CIS 1.73 



novels 



1.73 



no_CIS 



1.71 



no_CIS 



1.71 



no_CIS 



no_CIS 1.68 



no CIS 



1.67 



no_CIS 



1.66 



204487 s at 



210761 s at 



217626_at 



204380_s_at 
205455_at 



205073 at 



203287 at 



210735 s at 



203842_s__at 
206561_s_at 



214752_x__at 



no_CIS 



1.65 



no CIS 



no CIS 



1.64 
1.63 



2.75 



2.74 



2.73 



2.67 



2.66 



2.63 



2.61 



2.61 



2.59 



2.59 



no_CIS 
no_CIS 



1.62 
T61 



no_CIS 



1.61 



no_CIS 



1.61 



no_CIS 



1.58 



no_CIS 
no_CIS 



no_CIS 



1.57 
T57 
T56 



2.59 
"2^8 



2.58 
~Z58 



2.36 



2.34 



2.18 



2.17 



2.33 



2.17 



2.31 



2.15 



2.30 2.13 



2.28 2.13 



2.28 2.11 



2.26 2.08 



2.26 2.07 



2.24 



2.07 



2.23 



2.06 



Hs.95582 



Hs.75268 



Hs.82237 



Hs.18141 



Hs.247831 



Hs.227059 



Hs.170195 



Hs.289019 



Hs.118110 



Hs.377028 



2.23 I 2.05 
2.21 



2.04 



2.03 
"Z02 



2.58 



2.58 



2.55 



2.54 
2.53 



2.52 



2.17 



2.01 



2.16 



2.00 



2.15 



1.99 



2.15 I 1.97 
Tl4| 1.96 



2.13 



1.95 



Hs.367809 



Hs.86859 
Hs.201967 



Hs.1420 
Hs.2942 



Hs.1 52096 



Hs.18141 



Hs.5338 



Hs.172740 
Hs. 116724" 



Hs. 195464 



NM_001910; cathepsln E iso-" 
form a preproprotein 
NMJ48964; cathepsln E iso- 
form b preproprotein 



NMJ)07193; annexin A10 



NM_001958; eukaryoUc transla- 
tion elongation factor 1 alpha 2 



NMJ)05547; involucrin 



NM_000300; phospholipase A2, 
group IIA (platelets, synovial 
fluid) 



NIW 001442; fatty acid binding 
protein 4, adipocyte 



NM_000228; laminin subunit 
beta 3 precursor 



NM.030570; uroplakin 3B iso- 
form a NMjl 82683; uroplakin 
3B isoform c NMJ 82684; uro- 
plakin 3B isoform b 



NMJM6233; peptidylarginine 
deaminase type ill 



NM_000445; plectin 1. interme- 
diate filament binding protein 
500kDa 



NMJ)01248; ectonucleoside 
triphosphate diphosphohy- 
drolase 3 



NMJ)20142; NADH:ubiquinone 
oxidoreductase MLRQ subunit 
homolog 



NM_000299; plakophiiin 1 



NM_007144; ring finger protein 
110 

NM_006760; uroplakin 2 



NMJ) 19894; transmembrane 
protease, serine 4 isoform 1 
NMJ 83247; transmembrane 
protease, serine 4 isoform 2 



NMJ)05581; Lutheran blood 
group (Auberger b antigen 
included) 



NM_017689; hypothetical pro- 
tein FU20151 



NM_014417; BCL2 binding 
component 3 
NM_015162; lipidosin 



NMJ)06942; SRY-box 15 
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217028_at 



CIS 



213975_s_at CIS 



201859 at 



CIS 



219410_at I CIS 

207173_x__at CIS 

214651_s_at CIS 

201858_s_at CIS 



211430_s_at CIS 
213891_s__at CIS 



221872_at 



CIS 



212386 at 



CIS 



211161_s_at CIS 
214669_x_at CIS 
217388_s_at CIS 



203477_at 



CIS 



204688 at 






CIS 



218718 at 



CIS 



215176_x_at CIS 



20!842_s_at CIS 



212667_at 



CIS 



4.87 



5.17 



4.65 



4.43 



4.59 



4.49 



4,37 
Tl4 

4.06 



4.15 



3.98 



4.67 4.40 



4.01 



3.76 



3.70 3.45 



Hs.421986 



Hs.234734 



Hs.1908 



4.03 



3.86 



3.88 
T83 

3.78 



3.82 
~3^7 



3.76 
3.55 
3.44 



3.36 



3.35 



3.35 



3.32 



3.31 

IT 



3.63 



3.52 



3.50 



3.42 
3.36 
3.31 



3.49 3.29 



3.33j 37ST 
3.22 I 2. 

3.09 I 2.91 



3.05 

ToT 



2.83 
T77 



Hs. 104800 
Hs.443435 



Hs. 127428 
Hs.1908 



2.89 



2.73 



2.87 



2.69 



2.84 | 2.65 
2.80 | 2.62 
2.79 I 2.58 



Hs.413826 
Hs.359289 



Hs.82547 



Hs.359289 



3.28 



3.26 



3.22 



3.14 



3.11 



2.75 



2.56 



2.74 2.52 



2.70 



2.67 



2.48 



2.45 



Hs.377975 
Hs.444471 



Hs.409034 



Hs.409798 



Hs.43080 



Hs.503443 



2.65 
~2j6T 



2.44 
"2i42 



Hs.76224 
Hs.1 11779 



NIW 003282; troponin \, skeletal" 
fast 



NM_005522; homeobox A1 " 
protein Isoform a NM_1 53620; 
homeobox A1 protein isoform b 



NM_000497; cytochrome P450, 
subfamily XIB (steroid 11-beta- 
hydroxylase). polypeptide 1 
precursor 



NM__004692; NM_J)32727; 

intemexin neuronal Intermediate 
filament protein, alpha 



NM_000213; integrin, beta 4 
NM_002145; homeo box B2 " 
NM_018058: cartilage acidic 
protein 1 



NMJ501910; cathepsin E iso- 
form a preproprotein 
NIVM 48964; cathepsin E iso- 
form b preproprotein 



NM_007193; annexin A10 



NM_001958; eukaryotic transla- 
tion elongation factor 1 alpha 2 



NM_0U5547; involucrin 
NM_00030o; phospholipase A2, 
group HA (platelets, synovial 
fluid) 

NMJ)01442; fatty acid binding' 
protein 4, adipocyte 
NM_000228; laminin subunit 
beta 3 precursor 



NM_030570; uroplakin 3B Iso- 
form a NMJ 82683; uroplakin 
3B isoform c NM_1 82684; uro- 
plakin 3B isoform b 



NM_016233; peptidylarginine 
deiminase type 111 



NM_000445; plectin 1, interme- 
diate filament binding protein 
500kDa 

NM_001248; ectonucleoslde 
triphosphate diphosphohy- 
drolase 3 



NM 020142; NADH:ubiquinone 
oxidoreductase MLRQ subunit 
homolog 

NMJJ00299; plakophilin 1 
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209340 at 



215379__x_at 



200762 at 



211896__s_at 



204141 at 



201744_s at 



209138_x_at 
214677_x_at 



CIS 



CIS 



CIS 



CIS 



CIS 



212077 at 



CIS 

CIS" 

CIS" 



CIS 






3.27 



3.26 



3.25 



3.21 



3.10 



3.10 



3.05 



3.19 



3.05 



3.18 



3.17 
Tl4 



3.11 



3.05 



2.61 



2.59 



2.56 



2.53 



2.39 



2.39 



2.34 



2.32 



Hs.21293 
Hs.449601 



Hs.173381 



Hs. 15631 6 



2.53 



3.03 



3.03 
T02 



2.99 



2.50 



2.47 



2.46 



2.28 Hs.300701 



2.27 Hs.406475 



2.24 I Hs.505407 
2.23 Hs.449601 



2.21 Hs.443811 



NM_007144; ring finger protein 
110 



NMJM)6760; uroplakin 2 



NM_019894; transmembrane 
protease, serine 4 isoform 1 
NM_1 83247; transmembrane 
protease, serine 4 isoform 2 



NM_005581; Lutheran blood 
group (Auberger b antigen 
included) 



NM_017689; hypothetical pro- 
tein FU20151 



NM.014417; BCL2 binding" 
component 3 



NM_015162; lipidosin 
NM_006942; SRY-box 15 



NMJ503282; troponin I, skeletal, 
fast 



206392 s at 



CIS 



212998 x at 



CIS 



201616_s at 



CIS 



215121 x at 



CIS 
CIS" 

CIS 



202917_s_at 



201560 at 



218918_at 



218656_s_at 



201088 at 



CIS 
CIS" 

CIS 



CIS 



CIS 



CIS 



CIS 



3.11 



2.98 



2.43 



2.20 Hs.82547 



3.09 



2.94 



2.40 



2.19 Hs.375115 



3.08 



2.93 



2.38 



3.07 
3.06 



2.88 
T85 

2.84 



3.05 
T03 



2.83 



2.79 
T79 



2.99 



2.77 



2.99 



2.37 
"Z35 



2.34 



2.18 Hs.443811 
2.15 Hs.155597 



2.14 Hs.387679 



2.13 Hs.356861 



2.33 
T32 

2.31 



2.11 Hs. 170328 
2.10 Hs.17109 



2.08 Hs.416073 
2.08 Hs.25035 



2.29 



2.06 Hs.8910 



2.76 2.27 



2.06 Hs.93765 



2.99 



2.76 



2.26 



2.04 Hs.159557 



NM_005522; homeobox A1 
protein isoform a NM_1 53620; 
homeobox A1 protein isoform b 



NM_000497; cytochrome P450, 
subfamily XIB (steroid 11-beta- 
hydroxylase), polypeptide 1 
precursor 



NM_004692; NM_032727; 
Intemexin neuronal intermediate 
filament protein, alpha 



NM_000213; Integrin, beta 4 
NMJ)02145; homeo box B2 
NM_018058; cartilage acidic 
protein 1 



NM_001910; cathepsin E iso- 
form a preproprotein 
NM_148964; cathepsin E iso- 
form b preproprotein 
NM_007193; annexin A10 
NM_001958; eukaryotic transla- 
tfon elongation factor 1 alpha 2 



NM_005547; involucrin 



NM_000300; phospholipase A2, group IIA (platelets, synovial fluid)



NM_001442; fatty acid binding 
protein 4, adipocyte 



NM_000228; laminin subunit
beta 3 precursor

Feature      Class   T-test  Perm 1%  Perm 5%  Perm 10%  UniGene    RefSeq accession; description
201291_s_at  CIS     2.97    2.75     2.25     2.04      Hs.156346  NM_030570; uroplakin 3B isoform a NM_182683; uroplakin 3B isoform c NM_182684; uroplakin 3B isoform b
215076_s_at  CIS     2.95    2.72     2.24     2.03      Hs.443625  NM_016233; peptidylarginine deiminase type III
212195_at    CIS     2.94    2.71     2.22     2.02      Hs.71968   NM_000445; plectin 1, intermediate filament binding protein 500kDa
209732_at    CIS     2.94    2.68     2.22     2.00      Hs.85201   NM_001248; ectonucleoside triphosphate diphosphohydrolase 3
212192_at    CIS     2.94    2.67     2.22     1.99      Hs.109438  NM_020142; NADH:ubiquinone oxidoreductase MLRQ subunit homolog
221671_x_at  CIS     2.92    2.67     2.20     1.98      Hs.377975  NM_000299; plakophilin 1
211671_s_at  CIS     2.91    2.66     2.20     1.98      Hs.126608  NM_007144; ring finger protein 110
214352_s_at  CIS     2.86    2.66     2.19     1.97      Hs.412107  NM_006760; uroplakin 2


Feature: Probe-set on U133A GeneChip 
Class: The group in which the marker is up-regulated 
T-test: The t-test value 
Perm 1%: The 1% permutation level
Perm 5%: The 5% permutation level 
Perm 10%: The 10% permutation level 
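
The text does not state how the permutation levels were computed. As a minimal sketch, assuming the class labels are shuffled repeatedly and the level is taken as an upper percentile of the absolute t-statistic under shuffling, the three thresholds for one probe-set could be obtained as follows. The function names, the Welch form of the t-statistic and the number of permutations are assumptions made for illustration only.

```python
import math
import random
from statistics import mean, variance


def t_statistic(a, b):
    """Welch t-statistic between two lists of expression values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def permutation_levels(values, labels, n_perm=2000,
                       alphas=(0.01, 0.05, 0.10), seed=0):
    """Upper-tail thresholds of |t| under random relabelling (assumed scheme).

    values: expression of one probe-set across samples
    labels: 1 for one class (e.g. CIS), 0 for the other
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # class sizes are preserved by shuffling
        a = [v for v, y in zip(values, shuffled) if y == 1]
        b = [v for v, y in zip(values, shuffled) if y == 0]
        null.append(abs(t_statistic(a, b)))
    null.sort()
    levels = {}
    for alpha in alphas:
        # Value exceeded by roughly a fraction alpha of the shuffled statistics.
        idx = max(0, len(null) - 1 - round(alpha * len(null)))
        levels[alpha] = null[idx]
    return levels
```

Under this reading, a marker would count as significant at the 1% level when its t-test value exceeds the value returned for alpha = 0.01.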

Construction of a molecular CIS classifier
A classifier able to diagnose CIS from gene expressions in TCC or in bladder biopsies may
increase the detection rate of CIS. Our first approach was to classify superficial
TCC with or without CIS in the surrounding mucosa. This could have the added benefit of
reducing the number of random biopsies that need to be taken.

We built a CIS classifier as previously described (Dyrskjot et al. 2003), using cross-validation
to determine the optimal number of genes for classifying CIS with the fewest errors. The best
classifier performance (1 error) was obtained in cross-validation loops using 25 genes (see
figure 16); 16 of these were included in 70% of the cross-validation loops, and these were
selected to represent our final classifier for CIS diagnosis (Fig. 17a and table 21).
Permutation analysis showed that 13 of these were significant at a 1% confidence level; the
remaining three genes were above a 10% confidence level.
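
The selection of genes that recur across cross-validation loops can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Dyrskjot et al. 2003: it assumes leave-one-out loops, ranks genes by the absolute Welch t-statistic in each loop, and keeps genes chosen in at least 70% of the loops. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, variance


def t_statistic(a, b):
    """Welch t-statistic between two lists of expression values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def top_genes(expr, labels, k):
    """Indices of the k genes with the largest |t| between the two classes."""
    scores = []
    for g, row in enumerate(expr):
        pos = [v for v, y in zip(row, labels) if y == 1]
        neg = [v for v, y in zip(row, labels) if y == 0]
        scores.append((abs(t_statistic(pos, neg)), g))
    return [g for _, g in sorted(scores, reverse=True)[:k]]


def stable_genes(expr, labels, k, min_fraction=0.70):
    """Genes selected in at least min_fraction of leave-one-out loops.

    expr: one row per gene, one column per sample
    labels: 1 for CIS, 0 for no CIS (each class needs at least 3 samples)
    """
    n = len(labels)
    counts = Counter()
    for left_out in range(n):
        keep = [i for i in range(n) if i != left_out]
        sub_expr = [[row[i] for i in keep] for row in expr]
        sub_labels = [labels[i] for i in keep]
        counts.update(top_genes(sub_expr, sub_labels, k))
    return sorted(g for g, c in counts.items() if c / n >= min_fraction)


# Toy data: gene 0 separates the classes, genes 1 and 2 do not.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_expr = [
    [10, 11, 9, 10, 1, 2, 0, 1],
    [5, 1, 4, 2, 3, 5, 1, 4],
    [2, 6, 3, 5, 5, 2, 6, 3],
]
print(stable_genes(toy_expr, toy_labels, k=1))
```

In the text above, the corresponding choices would be k = 25 genes per loop with a 70% inclusion threshold, which left 16 genes.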
Table 21. The 16-gene molecular classifier of CIS

Feature      Class   T-test  Perm 1%  Perm 5%  Perm 10%  UniGene    RefSeq accession; description
213633_at    no_CIS  1.51    2.46     2.04     1.85      Hs.97858   NM_018957; SH3-domain binding protein 1
212784_at    no_CIS  1.36    2.27     -        1.70      Hs.388236  NM_015125; capicua homolog
209241_x_at  no_CIS  1.13    1.78     1.48     1.33      Hs.112028  NM_015716; misshapen/NIK-related kinase isoform 1 NM_153827; misshapen/NIK-related kinase isoform 3 NM_170663; misshapen/NIK-related kinase isoform 2
217941_s_at  CIS     2.3     1.96     1.66     1.47      Hs.8117    NM_018695; erbb2 interacting protein
201877_s_at  CIS     2.27    1.90     1.62     1.45      Hs.249955  NM_002719; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform a NM_178586; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform b NM_178587; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform c NM_178588; gamma isoform of regulatory subunit B56, protein phosphatase 2A isoform d
209630_s_at  CIS     1.97    1.54     1.31     1.15      Hs.444354  NM_012164; F-box and WD-40 domain protein 2
202777_at    CIS     1.93    1.51     1.29     1.12      Hs.104315  NM_007373; soc-2 suppressor of clear homolog
200958_s_at  CIS     1.92    1.49     1.28     1.11      Hs.164067  NM_005625; syndecan binding protein (syntenin)
209579_s_at  CIS     1.79    1.36     1.16     1.01      Hs.35947   NM_003925; methyl-CpG binding domain protein 4
209004_s_at  CIS     1.63    1.21     1.00     0.89      Hs.5548    NM_012161; F-box and leucine-rich repeat protein 5 isoform 1 NM_033535; F-box and leucine-rich repeat protein 5 isoform 2
218150_at    CIS     1.6     1.18     0.98     0.86      Hs.342849  NM_012097; ADP-ribosylation factor-like 5 isoform 1 NM_177985; ADP-ribosylation factor-like 5 isoform 2
202076_at    CIS     1.53    1.12     0.92     0.82      Hs.289107  NM_001166; baculoviral IAP repeat-containing protein 2
204640_s_at  CIS     1.45    1.03     0.83     0.75      Hs.129951  NM_003563; speckle-type POZ protein
201887_at    CIS     1.32    0.92     0.74     0.66      Hs.285115  NM_001560; interleukin 13 receptor, alpha 1 precursor
212802_s_at  CIS     1.31    0.91     0.72     0.65      Hs.287266
212899_at    CIS     1.29    0.89     0.71     0.64      Hs.129836  NM_015076; cyclin-dependent kinase (CDC2-like) 11


Feature: Probe-set on U133A GeneChip
Class: The group in which the marker is up-regulated 
T-test: The t-test value 
Perm 1%: The 1% permutation level
Perm 5%: The 5% permutation level 
Perm 10%: The 10% permutation level 

Exploration of strength of CIS classifier

To further explore the strength of classifying CIS, we also built a classifier by randomly
selecting half of the samples for training and using the other half for testing. Cross-validation
was used again in the training of this classifier to optimise the gene set for classifying
independent samples. Cross-validation with 15 genes showed good performance (see
figure 18), and 7 of these genes were included in 70% of the cross-validation loops. These 7
genes classified the samples in the test set with only one error, sample 1482-1 (χ²-test,
P<0.002). Only two of the genes were also included in the 16-gene classifier, which is
understandable considering the number of tests performed and the limitations in sample
size. This classification performance is notable considering the small number of samples
used for training the classifier.
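
The split into training and test halves can be illustrated with a short sketch. The nearest-centroid rule used here is a stand-in chosen for brevity, since the classifier of Dyrskjot et al. 2003 is not restated in this section; the function names, the seed parameter and the unstratified split are likewise assumptions.

```python
import random
from statistics import mean


def split_half(n_samples, seed=0):
    """Randomly assign half of the sample indices to training, the rest to testing."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    half = n_samples // 2
    return sorted(idx[:half]), sorted(idx[half:])


def centroids(samples, labels):
    """Mean expression profile per class; each sample is a list of gene values."""
    result = {}
    for cls in set(labels):
        members = [s for s, y in zip(samples, labels) if y == cls]
        result[cls] = [mean(col) for col in zip(*members)]
    return result


def predict(sample, cents):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(cents, key=lambda cls: sq_dist(sample, cents[cls]))


def held_out_errors(samples, labels, seed=0):
    """Train on a random half and count misclassified samples in the other half."""
    train, test = split_half(len(samples), seed)
    cents = centroids([samples[i] for i in train], [labels[i] for i in train])
    return sum(predict(samples[i], cents) != labels[i] for i in test)
```

Because the split is not stratified, a small data set can leave one class out of the training half; a real analysis would need to guard against that.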

Grouping of normal and cystectomies with CIS

We used hierarchical cluster analysis to group the 9 normal biopsies and the 10 biopsies from
cystectomies with CIS based on the normalised expression profiles of the 16 classifier genes
(Fig. 17b). This clustering separated the samples from cystectomies with CIS lesions from
the normal samples with only a few exceptions: 8 of the 10 biopsies from cystectomies were
found in one main branch of the dendrogram and 8 of the 9 normal biopsies were found
on the other main branch (χ²-test, P<0.002).
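
The two main branches of a dendrogram correspond to the last two clusters left by agglomerative clustering. The sketch below shows this under assumptions the text does not confirm: average linkage and a distance of one minus the Pearson correlation. The function names are hypothetical.

```python
import math
from statistics import mean


def correlation_distance(a, b):
    """1 - Pearson correlation between two expression profiles."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    na = math.sqrt(sum((x - ma) ** 2 for x in a))
    nb = math.sqrt(sum((y - mb) ** 2 for y in b))
    if na == 0 or nb == 0:
        return 1.0  # a constant profile carries no correlation information
    return 1.0 - cov / (na * nb)


def two_main_branches(profiles):
    """Average-linkage agglomerative clustering, stopped at two clusters."""
    n = len(profiles)
    clusters = [[i] for i in range(n)]
    dist = {(i, j): correlation_distance(profiles[i], profiles[j])
            for i in range(n) for j in range(i + 1, n)}

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(dist[p] for p in pairs) / len(pairs)

    while len(clusters) > 2:
        a, b = min(((x, y) for x in range(len(clusters))
                    for y in range(x + 1, len(clusters))),
                   key=lambda xy: linkage(clusters[xy[0]], clusters[xy[1]]))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]
```

Each returned list holds the sample indices of one main branch, which can then be compared with the known normal and CIS labels.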



Claims

1. A method of predicting the prognosis of a biological condition in animal tissue,

comprising collecting a sample comprising cells from the tissue and/or expression products
from the cells,

determining an expression level of at least one gene in the sample, said gene being selected
from the group of genes consisting of gene No. 1 to gene No. 562,

correlating the expression level to at least one standard expression level to predict the
prognosis of the biological condition in the animal tissue.

2. The method of claim 1, wherein the animal tissue is selected from body organs.

3. The method of claim 2, wherein the animal tissue is selected from epithelial tissue in 
body organs. 

4. The method of claim 3, wherein the animal tissue is selected from epithelial tissue in the 
urinary bladder. 

5. The method according to claim 4, wherein the stage is selected from bladder cancer 
stages Ta, Carcinoma in situ (CIS), T1, T2, T3 and T4.

6. The method according to claim 5, comprising determining at least the expression of a Ta 
stage gene from a Ta stage gene group, at least one T1 stage gene from a T1 stage 
gene group, at least a T2 stage gene from a T2 stage gene group, at least a T3 stage 
gene from a T3 stage gene group, at least a T4 stage gene from a T4 stage gene
group, wherein at least one gene from each gene group is expressed in a significantly 
different amount in that stage than in one of the other stages. 

7. The method according to claim 4, 5 or 6, wherein the stage is bladder cancer stage Ta. 

8. The method according to claim 4, wherein the animal tissue is mucosa. 



9. The method of any of the preceding claims, wherein the biological condition is an
adenocarcinoma, a carcinoma, a teratoma, a sarcoma, and/or a lymphoma and/or
carcinoma-in-situ, and/or dysplasia-in-situ.

10. The method of any of the preceding claims, wherein the sample is a biopsy of the tissue
or of metastasis originating from said tissue.

11. The method according to any of the preceding claims 1-6, wherein the sample is a cell
suspension made from the tissue.

12. The method according to any of the preceding claims, wherein the sample comprises 
substantially only cells from said tissue. 

13. The method according to claim 9, wherein the sample comprises substantially only cells
from mucosa or tumors derived from said mucosa cells. 

14. The method according to any of the preceding claims, wherein the gene from the group 
of genes is selected individually from gene No. 1 to gene No. 188 (stages). 

15. The method according to any of the preceding claims 1-13, wherein the gene from the 
group of genes is selected individually from gene No. 189 to gene No. 214 (recurrence). 

16. The method according to any of the preceding claims 1-13, wherein the gene from the
group of genes is selected individually from gene No. 215 to gene No. 232 (SCC).

17. The method according to any of the preceding claims 1-13, wherein the gene from the 
group of genes is selected individually from gene No. 233 to gene No. 446 (progression). 

18. The method according to any of the preceding claims 1-13, wherein the gene from the
group of genes is selected individually from gene No. 447 to gene No. 562 (CIS). 

19. The method according to any of the preceding claims, wherein the expression level of at 
least two genes from the group of genes are determined. 

20. The method according to any of the preceding claims, wherein the expression level of at 
least three genes from the group of genes are determined. 

21. The method according to any of the preceding claims, wherein the expression level of at 
least four genes from the group of genes are determined.

22. The method according to any of the preceding claims, wherein the expression level of at
least five genes from the group of genes are determined.

23. The method according to any of the preceding claims, wherein the expression level of
more than six genes from the group of genes are determined.

24. The method according to any of the preceding claims, wherein the difference in expression
level of a gene from the gene group to the at least one standard expression level is
at least two-fold.

25. The method according to any of the preceding claims, wherein the difference in expression
level of a gene from the gene group to the at least one standard expression is at
least three-fold.

26. The method according to any of the preceding claims, wherein the difference in expression
level of a gene from the gene group to the at least one standard expression is at
least four-fold.

27. The method according to any of the preceding claims, wherein the expression level is 
determined by determining the mRNA of the cells. 

28. The method according to any of the claims 1-26, wherein the expression level is determined
by determining expression products, such as peptides, in the cells.

29. The method according to claim 28, wherein the expression level is determined by determining
expression products, such as peptides, in the body fluids, such as blood, serum,
plasma, faeces, mucus, sputum, cerebrospinal fluid, and/or urine.

30. The method according to any of the preceding claims, wherein the stage of the biological 
condition has been determined prior to the prediction of the prognosis. 

31. The method according to claim 30, wherein the stage of the biological condition has 
been determined by histological examination of the tissue or by genotyping of the tissue.

32. The method according to claim 28 or 29, wherein the stage of the biological condition 
has been determined by genotyping of the tissue. 

33. The method according to claim 31 or 32, wherein the stage of the biological condition
has been determined by

determining the expression of at least a first stage gene from a first stage gene group
and/or at least a second stage gene from a second stage gene group, wherein at least
one of said genes is expressed in said first stage of the condition in a higher amount
than in said second stage, and the other gene is expressed in said first stage of the
condition in a lower amount than in said second stage of the condition,

correlating the expression level of the assessed genes to a standard level of expression

determining the stage of the condition.

34. The method according to any of the preceding claims, wherein the expression level of at
least two genes is determined, by

determining a first expression level of at least one gene from a first gene group, wherein
the gene from the first gene group is selected from the group of gene No. 237, 238,
239, 240, 241, 242, 243, 245, 246, 247, 248, 250, 253, 254, 257, 258, 260, 263,
264, 265, 267, 270, 271, 272, 278, 283, 284, 287, 288, 290, 291, 292, 294, 297,
298, 300, 302, 303, 305, 309, 310, 315, 316, 317, 318, 319, 321, 324, 329, 335,
336, 337, 339, 340, 344, 346, 347, 354, 356, 358, 359, 362, 364, 365, 368, 369,
371, 372, 377, 378, 379, 380, 381, 382, 383, 384, 388, 391, 393, 395, 396, 397,
399, 402, 403, 404, 409, 413, 417, 419, 420, 421, 422, 423, 425, 427, 429, 430,
431, 432, 437, 444 (progressor genes), and

determining a second expression level of at least one gene from a second gene group,
wherein the second gene group is selected from the group of genes No. 233, 234, 235,
236, 244, 249, 251, 252, 255, 256, 259, 261, 262, 266, 268, 269, 273, 274, 275,
276, 277, 279, 280, 281, 282, 285, 286, 289, 293, 295, 296, 299, 301, 304, 306,
307, 308, 311, 312, 313, 314, 320, 322, 323, 325, 326, 327, 328, 330, 331,
332, 333, 334, 338, 341, 342, 343, 345, 348, 349, 350, 351, 352, 353, 355, 357,
360, 361, 363, 366, 367, 370, 373, 374, 375, 376, 385, 386, 387, 389, 390, 392,
394, 398, 400, 401, 405, 406, 407, 408, 410, 411, 412, 414, 415, 416, 418, 424,
426, 428, 433, 434, 435, 436, 438, 439, 440, 441, 442, 443, 445, 446
(non-progressor genes), and

correlating the first expression level to a standard expression level for progressors,
and/or the second expression level to a standard expression level for non-progressors to
predict the prognosis of the biological condition in the animal tissue.

35. A method of determining the stage of a biological condition in animal tissue,
comprising collecting a sample comprising cells from the tissue,

determining an expression level of at least one gene selected from the group of genes
consisting of gene No. 1 to gene No. 562

correlating the expression level of the assessed genes to at least one standard level of
expression determining the stage of the condition.

36. The method according to claim 35, wherein the expression level of at least two genes is
determined by

determining the expression of at least a first stage gene from a first stage gene group
and at least a second stage gene from a second stage gene group, wherein at least one
of said genes is expressed in said first stage of the condition in a higher amount than in
said second stage, and the other gene is expressed in said first stage of the condition
in a lower amount than in said second stage of the condition, and

correlating the expression level of the assessed genes to a standard level of expression
determining the stage of the condition

37. The method according to claim 35 or 36, wherein the stage is selected from bladder
cancer stages Ta, carcinoma in situ (CIS), T1, T2, T3 and T4.

38. The method according to claim 37, comprising determining at least the expression of a 
Ta stage gene from a Ta stage gene group, at least one T1 stage gene from a T1 stage 
gene group, at least a T2 stage gene from a T2 stage gene group, at least a T3 stage 
gene from a T3 stage gene group, at least a T4 stage gene from a T4 stage gene
group, wherein at least one gene from each gene group is expressed in a significantly 
different amount in that stage than in one of the other stages. 

39. The method according to claim 38, wherein a Ta stage gene is selected individually from
the group of Table B1.

40. The method according to claim 38, wherein a T1 stage gene is selected individually from
the group of Table B2.

41. The method according to claim 38, wherein a T2 stage gene is selected individually from
the group of Table B3.

42. The method according to any of claims 35-41, said method comprising one or more of
the features defined in any of the claims 1-34.

43. A method of determining an expression pattern of a bladder cell sample, comprising:

collecting a sample comprising bladder cells and/or expression products from bladder
cells,

determining the expression level of at least one gene in the sample, said gene being
selected from the group of genes consisting of gene No. 1 to gene No. 562, and obtaining
an expression pattern of the bladder cell sample.

44. The method according to claim 43, wherein the expression level of at least two genes 
are determined. 

45. The method according to claim 43, wherein the expression level of at least three genes 
are determined. 

46. The method according to claim 43, wherein the expression level of at least four genes 
are determined.

47. The method according to claim 43, wherein the expression level of at least five genes 
are determined. 

48. The method according to claim 43, wherein the expression level of more than six genes
are determined. 

49. The method of claims 43-48, wherein the genes exclude genes which are expressed in
the submucosal, muscle, or connective tissue, whereby a pattern of expression is formed
for the sample which is independent of the proportion of submucosal, muscle, or connective
tissue cells in the sample.

50. The method of claim 49, comprising determining the expression level of one or more
genes in the sample comprising predominantly submucosal, muscle, and connective tissue
cells, obtaining a second pattern, subtracting said second pattern from the expression
pattern of the bladder cell sample, forming a third pattern of expression, said third
pattern of expression reflecting expression of the bladder mucosa or bladder cancer cells
independent of the proportion of submucosal, muscle, and connective tissue cells present
in the sample.

51. The method of any of the preceding claims 43-50, wherein the sample is a biopsy of the
tissue.

52. The method according to any of the preceding claims 43-51, wherein the sample is a cell
suspension.



53. The method according to any of the preceding claims 43-52, wherein the sample comprises
substantially only cells from said tissue.

54. The method according to claim 53, wherein the sample comprises substantially only cells 
from mucosa. 



55. A method of predicting the prognosis of a biological condition in human bladder tissue
comprising,

collecting a sample comprising cells from the tissue,

determining an expression pattern of the cells as defined in any of claims 43-54,

correlating the determined expression pattern to a reference pattern,

predicting the prognosis of the biological condition of said tissue.

56. A method for determining the stage of a biological condition in animal tissue
comprising,

collecting a sample comprising cells from the tissue,

determining an expression pattern of the cells as defined in any of claims 43-54,

correlating the determined expression pattern to a reference pattern,

determining the stage of the biological condition in said tissue.



57. A method for reducing cell tumorigenicity or malignancy of a cell, said method
comprising

contacting a tumor cell with at least one peptide expressed by at least one gene selected
from the group of genes consisting of gene Nos. 200-214, 233, 234, 235, 236, 244, 249, 251,
252, 255, 256, 259, 261, 262, 266, 268, 269, 273, 274, 275, 276, 277, 279, 280, 281, 282,
285, 286, 289, 293, 295, 296, 299, 301, 304, 306, 307, 308, 311, 312, 313, 314, 320, 322,
323, 325, 326, 327, 328, 330, 331, 332, 333, 334, 338, 341, 342, 343, 345, 348, 349, 350,
351, 352, 353, 355, 357, 360, 361, 363, 366, 367, 370, 373, 374, 375, 376, 385, 386, 387,
389, 390, 392, 394, 398, 400, 401, 405, 406, 407, 408, 410, 411, 412, 414, 415, 416, 418,
424, 426, 428, 433, 434, 435, 436, 438, 439, 440, 441, 442, 443, 445, 446, 453, 460, 461,
463, 464, 465, 466, 467, 469, 470, 471, 472, 473, 475, 476, 477, 479, 480, 481, 482, 483,
485, 486, 487, 488, 490, 492, 494, 496, 497, 498, 499, 503, 515, 516, 517, 521, 526, 527,
528, 530, 532, 533, 537, 539, 540, 541, 542, 543, 545, 554, 557, 560.

58. The method according to claim 57, wherein the tumor cell is contacted with at least two 
different peptides. 

59. A method for reducing cell tumorigenicity of a cell, said method comprising 

obtaining at least one gene selected from the group of genes consisting of gene No. 200-214,
233, 234, 235, 236, 244, 249, 251, 252, 255, 256, 259, 261, 262, 266, 268, 269, 273,
274, 275, 276, 277, 279, 280, 281, 282, 285, 286, 289, 293, 295, 296, 299, 301, 304, 306,
307, 308, 311, 312, 313, 314, 320, 322, 323, 325, 326, 327, 328, 330, 331, 332, 333, 334,
338, 341, 342, 343, 345, 348, 349, 350, 351, 352, 353, 355, 357, 360, 361, 363, 366, 367,
370, 373, 374, 375, 376, 385, 386, 387, 389, 390, 392, 394, 398, 400, 401, 405, 406, 407,
408, 410, 411, 412, 414, 415, 416, 418, 424, 426, 428, 433, 434, 435, 436, 438, 439, 440,
441, 442, 443, 445, 446, 453, 460, 461, 463, 464, 465, 466, 467, 469, 470, 471, 472, 473,
475, 476, 477, 479, 480, 481, 482, 483, 485, 486, 487, 488, 490, 492, 494, 496, 497, 498,
499, 503, 515, 516, 517, 521, 526, 527, 528, 530, 532, 533, 537, 539, 540, 541, 542, 543,
545, 554, 557, 560,

introducing said at least one gene into the tumor cell in a manner allowing expression
of said gene(s).

60. The method according to claim 59, wherein at least one gene is introduced into the 
tumor cell. 

61. The method according to claim 59 or 60, wherein at least two different genes are
introduced into the tumor cell.

62. A method for reducing cell tumorigenicity or malignancy of a cell, said method
comprising

obtaining at least one nucleotide probe capable of hybridising with at least one gene of
a tumor cell, said at least one gene being selected from the group of genes consisting
of gene Nos. 1-199, 215-232, 237, 238, 239, 240, 241, 242, 243, 245, 246, 247, 248,
250, 253, 254, 257, 258, 260, 263, 264, 265, 267, 270, 271, 272, 278, 283, 284, 287,
288, 290, 291, 292, 294, 297, 298, 300, 302, 303, 305, 309, 310, 315, 316, 317, 318,
319, 321, 324, 329, 335, 336, 337, 339, 340, 344, 346, 347, 354, 356, 358, 359, 362,
364, 365, 368, 369, 371, 372, 377, 378, 379, 380, 381, 382, 383, 384, 388, 391, 393,
395, 396, 397, 399, 402, 403, 404, 409, 413, 417, 419, 420, 421, 422, 423, 425, 427,
429, 430, 431, 432, 437, 444, 447, 448, 449, 450, 451, 452, 454, 455, 456, 457, 458,
459, 462, 468, 474, 478, 484, 489, 491, 493, 495, 500, 501, 502, 504, 505, 506, 507,
508, 509, 510, 511, 512, 513, 514, 518, 519, 520, 522, 523, 524, 525, 529, 531, 534,
535, 536, 538, 544, 546, 547, 548, 549, 550, 551, 552, 553, 555, 556, 558, 559, 561,
562,

introducing said at least one nucleotide probe into the tumor cell in a manner allowing
the probe to hybridise to the at least one gene, thereby inhibiting expression of said at
least one gene.

63. The method according to claim 62, wherein at least one gene is introduced into the
tumor cell.

64. The method according to claim 62 or 63, wherein at least two different genes are
introduced into the tumor cell.

65. A pharmaceutical composition for the treatment of a biological condition comprising at
least one antibody against an expression product of a cell from a biological tissue
produced by

obtaining expression product(s) from at least one gene said gene being selected from
the group of genes consisting of genes as defined in claim 62,

immunising a mammal with said expression product(s) obtaining antibodies against
the expression product.

66. A pharmaceutical composition for the treatment of a biological condition comprising at
least one peptide, said peptide being an expression product from a gene selected from
the group consisting of genes Nos. 1-562 or a fragment thereof.

67. A vaccine for the prophylaxis or treatment of a biological condition comprising at least
one expression product from at least one gene said gene being selected from the group
of genes consisting of gene as defined in claim 62.

68. Use of a method as defined in any of claims 1-64 for producing an assay for diagnosing
a biological condition in animal tissue.

69. Use of at least one expression product from at least one gene for preparation of a
pharmaceutical composition for the treatment of a biological condition in animal tissue.

70. Use of a gene, said gene being selected from the group of genes consisting of gene No.
1 to gene No. 562, for the preparation of a pharmaceutical composition for the
treatment of a biological condition in animal tissue.

71. Use of a probe as defined in any of claims 62-64 for the preparation of a pharmaceutical
composition for the treatment of a biological condition in animal tissue.

72. An assay for predicting the prognosis of a biological condition in animal tissue,
comprising

at least one first marker capable of detecting an expression level of at least one gene
selected from the group of genes consisting of gene No. 1 to gene No. 562.

73. The assay according to claim 72, wherein the marker is a nucleotide probe.

74. The assay according to claim 72, wherein the marker is an antibody.

75. The assay according to claim 72, comprising at least a first marker and/or a second
marker, wherein the first marker is capable of detecting a gene from a first gene group
as defined in claim 34, and/or the second marker is capable of detecting a gene from a
second gene group as defined in claim 34.

76. The assay according to any of claims 72-75, said assay further comprising means for
correlating the expression level of the at least one gene to a standard expression level
and/or a reference expression pattern.